INTERNATIONAL SERIES IN PHYSICS 

LEE A. DuBRIDGE, Consulting Editor 


THE FUNDAMENTAL PRINCIPLES 

OF 

QUANTUM MECHANICS 


This complete edition is produced in full compliance
with the government's regulations for
conserving paper and other essential materials.



INTERNATIONAL SERIES IN PHYSICS 

LEE A. DuBRIDGE, Consulting Editor


Bacher and Goudsmit — Atomic Energy States

Bitter — Introduction to Ferromagnetism 

Clark — Applied X-rays 

Condon and Morse — Quantum Mechanics 

Curtis — Electrical Measurements 

Davey — Crystal Structure and Its Applications 

Edwards — Analytic and Vector Mechanics 

Eldridge — The Physical Basis of Things 

Hardy and Perrin — The Principles of Optics 

Harnwell — Principles of Electricity and Electromagnetism

Harnwell and Livingood — Experimental Atomic Physics 
Houston — Principles of Mathematical Physics 
Hughes and DuBridge — Photoelectric Phenomena 
Hund — High-frequency Measurements 
Hund — Phenomena in High-frequency Systems 

Kemble — The Fundamental Principles of Quantum
Mechanics

Kennard — Kinetic Theory of Gases 
Koller — The Physics of Electron Tubes
Morse — Vibration and Sound

Muskat — The Flow of Homogeneous Fluids through
Porous Media

Pauling and Goudsmit — The Structure of Line Spectra

Richtmyer and Kennard — Introduction to Modern 
Physics 

Ruark and Urey — Atoms, Molecules and Quanta 
Seitz — The Modern Theory of Solids
Slater — Introduction to Chemical Physics 
Slater — Microwave Transmission 

Slater and Frank — Introduction to Theoretical 
Physics 

Smythe — Static and Dynamic Electricity 
Stratton — Electromagnetic Theory 
White — Introduction to Atomic Spectra 
Williams — Magnetic Phenomena 


Dr. F. K. Richtmyer was consulting editor of the series from its
inception in 1929 until his death in 1939.





THE 

FUNDAMENTAL PRINCIPLES 
OF QUANTUM MECHANICS 

With Elementary Applications 


BY 


EDWIN C. KEMBLE


Professor of Physics, Harvard University


First Edition
Third Impression


McGRAW-HILL BOOK COMPANY, Inc.

NEW YORK AND LONDON
1937 



Copyright, 1937, by the
McGraw-Hill Book Company, Inc.


PRINTED IN THE UNITED STATES OF AMERICA

All rights reserved. This book, or
parts thereof, may not be reproduced
in any form without permission of
the publishers.


THE MAPLE PRESS COMPANY, YORK, PA.



DEDICATED TO

DAYTON CLARENCE MILLER




PREFACE 


This volume was originally intended to be an expansion of a summary
of the elements of quantum mechanics written some years ago for the
Reviews of Modern Physics by the author in collaboration with Professor
E. L. Hill. The point of view is essentially the same as in the summary,
but as the present work has grown in my hands it has lost most of its
resemblance to the initial pattern.

The method of approach was dictated by the desire to meet the needs
of graduate students of physics. For this reason the argument is
inductive in form and applications of the theory have been interwoven with
the development of the basic mathematical structure. In order to
minimize the necessity for frequent consultation of mathematical
reference books, a good deal of background mathematical material is included
in Chap. IV.

In reading other treatises on quantum theory I have frequently been
distressed by the tendency to gloss over the numerous mathematical
uncertainties and pitfalls which abound in the subject. From the
standpoint of the beginner there is much to be said for this practice of
minimizing the defects of the theory in order to exhibit its main outlines in a
compact and attractive form. Nevertheless it has seemed to me that a
book which deliberately called attention to the weak spots in the
argument would be of considerable value to teachers and to students of the
more mature type. The work of the mathematician von Neumann
provides a masterly antidote to the lack of rigor characteristic of the
average physicist, but by common consent this work is too difficult for
any but the most mathematical students of this subject. Hence I have
been led to try my hand at bridging the gap between the exacting
technique of von Neumann and the usual less rigorous formulations of the
theory. In carrying this project through I have restricted the discussion
to such elementary mathematical methods as are the common property
of physicists today. The reader must judge my success in avoiding the
Scylla of sloppy thinking and the Charybdis of tedious complexity.
Fine print, starred sections, and appendices indicate portions of the
material which may well be omitted or briefly scanned on first reading.

A feature of the present volume on the physical and philosophical
side is its consistent emphasis on the operational point of view and on the
fundamental importance of Gibbsian assemblages of independent
systems in the physical interpretation of the mathematical formalism.


A considerable collection of references indicates the author's
indebtedness to the ideas of others, but the list is by no means exhaustive. I have
borrowed freely from other books and am particularly indebted to those
of von Neumann, Dirac, and of Born and Jordan.

It is a pleasure to thank my colleagues and former colleagues, Dr. 
Eugene Feenberg, Dr. W. H. Furry, Professor J. C. Slater, and Professor 
J. H. Van Vleck for invaluable suggestions and generous assistance. I 
am particularly indebted to Professor Van Vleck for reading the entire 
manuscript and for his constant encouragement. Dr. Montgomery H. 
Johnson is responsible for much of the work on the continuous spectrum 
in Sec. 31, while Dr. Bela Lengyel and Dr. Charles H. Fay have at 
various times spent long hours in checking equations and other technical 
assistance. The author is very grateful to the librarian of the Harvard 
Physics Laboratory, Mrs. Miner T. Patton, for her cheerfulness and 
accuracy in the repeated typing of successive editions of the manuscript. 

To the Milton Fund of Harvard University I am indebted for a 
generous grant for technical help in preparing the manuscript for the 
printer. 

Edwin C. Kemble. 

Peacham, Vermont,

August, 1937.



CONTENTS 

Page

Preface vii

Notation xvii 

Reference Abbreviations xviii 

CHAPTER I

Introduction to Dualistic Theory of Matter; Development of Schrodinger’s Wave Equation 1

1. Historical Introduction 1
2. The Dualistic Theory of Radiation 4
3. An Analogy between Geometrical Optics and Classical Mechanics 7
4. Wave Packets and Group Velocity 10
5. The Schrodinger Wave Equation for a Single Particle 14
  5a. The Time-free Equation 14
  5b. The Second (General) Schrodinger Equation 15
*6. The Application of the Restricted Relativity Principle 19
7. The Wave Equation for a System of Many Particles 21
  7a. Formulation of Equation 21
  7b. Relation of Schrodinger Equation to the Classical Hamiltonian Function 23
  *7c. The Schrodinger Equation and the Hamilton-Jacobi Equation 24
  *7d. The Wave Equation for a System of Charged Particles in a Classical External Electromagnetic Field 26
8. The Physical Interpretation of the Wave Function and the Normalization Condition 29
  8a. Probability and Quadratic Integrability 29
  8b. Normalization and Mass Current Density 31
  8c. A System Consisting of Two Independent Parts 33

CHAPTER II

Wave Packets and the Relation between Classical Mechanics and Wave Mechanics 35

9. Wave Packets and Group Velocity in a One-dimensional Homogeneous Medium 35
  9a. The Fourier Integral Theorem 35
  9b. Derivation of Group-velocity Formula 37
10. Wave Packets in Three Dimensions 41
*11. Wave Surfaces and the Hamilton-Jacobi Equation of the Classical Dynamics 43
*12. Wave Packets and the Motion of Particles in a Force Field; Fermat’s Principle 46
13. Direct Rigorous Proof of Newton’s Second Law of Motion for Wave Packets 49

* An asterisk before the number of a section or subsection indicates that the section
or subsection so marked may be omitted or skimmed to advantage on first reading.




14. The Statistical Interpretation of the Wave Theory of Matter 51
  14a. Review of Assumptions 51
  14b. Necessity of Introducing Assemblages 53
  14c. Multiplicity of Energy and Momentum Values for a Definite State 56
15. The Wave Function and Measurements of Linear Momentum 58
  15a. Operational Definition of Momentum for Free Particles 58
  15b. Computation of Momentum Probabilities from a Wave Function 60
  *15c. Momentum of Center of Gravity 63
  *15d. The Measurement of the Momenta of Particles Moving in a Force Field 66
  *15e. Individual Momenta of Particles in a System 68
  15f. Summary: The Determination of the Wave Function of an Assemblage 69
16. The Heisenberg Uncertainty Principle 72

CHAPTER III

One-dimensional Energy-level Problems 78

17. Boundary and Continuity Conditions; Eigenvalues and Eigenfunctions 78
18. The One-dimensional Anharmonic Oscillator 81
19. The Qualitative Behavior of the Integral Curves: Existence of Class A Eigenfunctions 82
  19a. Behavior of Integral Curves in Regions of Positive and Negative Kinetic Energy 82
  19b. The Discrete Eigenfunctions 84
  19c. The Continuous Spectrum of Class B Eigenfunctions 85
  19d. The Paradox of the Nodes 86
20. The Planck Ideal Linear Oscillator 87
  20a. The Sommerfeld Polynomial Method 87
  20b. Determination of the Eigenvalues 89
  20c. The Eigenfunctions and Their Properties 89
21. An Approximation Method Which Correlates the Eigenvalues of Wave Mechanics with the Energy Levels of the Bohr Theory 90
  21a. The B. W. K. Approximations for ψ(x) 90
  21b. Application to Eigenvalue Problem 93
  *21c. Zwaan's Method and the Stokes Phenomenon 95
  *21d. Analysis of the Stokes Phenomenon 97
  *21e. Derivation of the Connection Formulas 100
  *21f. Derivation of the Sommerfeld Phase-integral Quantum Condition 103
  *21g. Higher Approximations 107
  *21h. Modification of Method for Radial Motion in Two-particle Problem 107
  21i. Asymptotic Agreement of Wave Theory and Classical Theory Regarding Position of Particle 108
  21j. The Transmission of Progressive Matter Waves through a Potential Hill 109

CHAPTER IV

The Mathematical Theory of Complete Systems of Orthogonal Functions 113

22. Scalar Products and Systems of Orthogonal Functions 113
  22a. Expansion in a Series of Functions 113
  22b. Comparison of Properties of Vectors and Functions 114





  22c. Scalar Products of Vectors 115
  22d. Scalar Products of Quadratically Integrable Functions of m Variables 116
  22e. Spaces of Infinitely Many Dimensions 119
  22f. Proof of Orthogonality of Eigenfunctions of the One-dimensional Anharmonic Oscillator Problem 120
23. Self-adjoint Operators and Equations. The Sturm-Liouville Problem 121
  23a. Self-adjoint Differential Operators in One Dimension 121
  23b. Orthogonality with Respect to a Density Function ρ 123
  23c. The Sturm-Liouville Problem 124
  23d. Singular-point Boundary Conditions 125
  *23e. Existence of Discrete Eigenvalues for Sturm-Liouville Problems with Singular End Points 128
24. Reduction of Eigenvalue Problems Based on Self-adjoint Differential Equations to Variational Form 130
25. Completeness of System of Discrete Eigenfunctions of a Sturm-Liouville Problem 132
  *25a. The Eigenvalues as Absolute Minima 132
  *25b. The Expansion of Arbitrary Functions in Terms of Eigenfunctions 135

CHAPTER V

The Discrete Energy Spectrum of the Two-particle Central-field Problem 140

26. The Behavior of Solutions of an Ordinary Second-order Differential Equation near a Singular Point 140
27. The Legendre Polynomials 143
  27a. General Properties of the Legendre Equation 143
  27b. Explicit Determination of Eigenvalues and Eigenfunctions 144
28. The Energy Levels of the Two-particle Problem 146
  28a. The Wave Equation 146
  28b. Separation of the Variables 146
  28c. The Azimuthal Factor of the Wave Function 147
  28d. Determination of Θ(θ) and Its Eigenvalues 148
  28e. Completeness of System of Eigenfunctions 149
  28f. Physical Interpretation of Quantum Numbers l and m 150
  28g. Behavior of Radial Wave Functions at Boundary Points 152
  28h. The Dumbbell Model of the Diatomic Molecule 155
29. The Hydrogenic Atom 157
  29a. Application of the B. W. K. Method 157
  29b. Application of Polynomial Method 158
  29c. Generalized Laguerre Polynomials 160
  29d. The Most General Eigenfunction 161

CHAPTER VI

The Continuous Spectrum and the Basic Properties of Solutions of the Many-particle Problem 162

30. The Continuous Spectrum in One-dimensional Problems 162
  30a. The Nature and Use of the Eigenfunctions of the Continuous Spectrum 162
  30b. The Weyl Theory 163
  *30c. Formal Treatment of Continuous Spectrum as the Limit of a Discrete Spectrum 165
  *30d. The Spacing of Energy Levels in Problems β and α 168




  *30e. The Eigendifferentials 169
  *30f. Passage from the Completeness Theorem for Problem β to That for Problem α 171
  30g. The Fourier Integral Formulas a Special Case 173
  30h. Normal Packet Functions in One Dimension 174
  30i. Normal Packet Functions for the Two-particle Problem 174
  *30j. Normalization of the Class B Radial Eigenfunctions for the Hydrogenic Atom 176
31. Weak Quantization. Theory of Radioactive Emission of Alpha Particles 178
  31a. Weak Quantization in General 178
  31b. A Model for Alpha Particle Disintegration 179
  31c. Resonant Energy Intervals 181
  31d. Energy Distribution in Weakly Quantized States 186
  31e. The Disintegration Process 187
  *31f. Complex Eigenvalues 192
32. The Existence and Properties of Solutions of the Many-particle Schrodinger Eigenvalue-eigenfunction Problem 195
  32a. Introduction 195
  32b. New Boundary Conditions for Physically Admissible Wave Functions 197
  *32c. Approximating Arbitrary Quadratically Integrable Functions by Means of Class D Functions 201
  32d. Hermitian Character of the Hamiltonian Operator 202
  32e. Reduction of the Eigenvalue-eigenfunction Problem for Discrete Spectra to Variational Form 206
  32f. A Lower Bound for the Energy Integral 207
  *32g. Behavior of Solutions of the Differential Equation at Singular Domains 208
  *32h. The Auger Effect 213
  32i. The Discrete Eigenfunctions of the Differential Equation as Minimizing Functions 214
  32j. The Continuous Spectrum and the Completeness of the System of Eigenfunctions 215
  32k. Degeneracy 217

CHAPTER VII

Dynamical Variables and Operators 219

33. The Mean Values of the Cartesian Coordinates and Conjugate Linear Momenta 219
  33a. The Statistical Mean Values of the Coordinates 219
  33b. The Linear Momentum Operator 220
34. The Angular-momentum Operators 224
  34a. Definition of Operators 224
  34b. Hermitian Character of Angular-momentum Operators 225
  34c. The Expansion Theorem 226
  34d. Angular Momentum of a System of Particles 227
  34e. Mean Values 229
  34f. The Vector Angular Momentum and Its Square; the Symmetric Top 230
35. The Energy Operators 234
  35a. Calculation of Probabilities and Mean Values of Energy 234
  *35b. Transformation of Hamiltonian Operator 237
36. Dynamical Variables in General 240




  36a. Remarks on the Value of the General Theory 240
  36b. Possibility of Defining Physical Quantities by Operators 242
  36c. The Transformation of Probability Amplitudes and Dynamical Variables 245
  36d. Type 1 Operators as Dynamical Variables 248
  36e. Calculation of Probabilities 256
  36f. Type 2 Operators as Dynamical Variables; the Method of von Neumann 259
  36g. The Method of Dirac and Jordan 265
  *36h. Multiplication Operators in Many Dimensions 268
  36i. Transformation of Probability Amplitudes from One Arbitrary Coordinate Scheme to Another 270
  36j. Dynamical Variables with Complex Eigenvalues 275

CHAPTER VIII

Commutation Rules and Related Matters 278

37. Simultaneous Eigenfunctions and the Commutation of Dynamical Variables 278
  37a. Operator Algebra 278
  37b. Functions of a Single Operator 279
  37c. Commutative Operators 281
  37d. Functions of a Normal Set of Commuting Dynamical Variables 286
38. The Conservation Laws 288
  38a. Conservation of Energy 288
  38b. Variation of Energy When the Hamiltonian Depends on the Time 289
  38c. Conservation of an Arbitrary Dynamical Variable 290
  38d. Commutation Properties of the Hamiltonian and the Angular Momentum 291
39. Conjugate Dynamical Variables and Quantum-mechanical Equations of Motion 293
  39a. Conjugate Dynamical Variables 293
  39b. Functions of Non-commuting Linear Operators 300
  39c. An Operator Form of Hamilton's Equations of Motion 301
40. Symmetry Properties of the Wave Equation 303
  40a. Symmetry Properties in General 303
  40b. The Reflection Operators 305
  40c. The Rotation Operator 306
  40d. The Permutation Operators 308
  40e. Degeneracy and the Integrals of the Schrodinger Equation 310
  40f. The Normal Degeneracy of the Energy Levels of Free Atomic Systems 313

CHAPTER IX

The Measurement of Dynamical Variables 318

41. General Theory of Measurement 318
  41a. Fundamental Characteristics of Measurements 318
  41b. Pure States and Mixtures 320
  41c. Postulates Regarding Retrospective and Predictive Measurements 322
  41d. The Reduction of the Wave Packet 326
  41e. Classical Orbits and Wave Packets 331
42. More About Measurements 334




  42a. Conjugate Variables and Measurements 334
  42b. Impossibility of Measurements Which Imply Distinction between Particles of Same Species 335
  42c. A Classification of Observations 341
  42d. Measurements as Correlations 342
  42e. The Observing Mechanism Not Entirely Classical 343

CHAPTER X

Matrix Theory 348

43. Matrix Algebra 348
44. Matrices and Operators 352
  44a. The Derivation of Matrices from Operators 352
  44b. Canonical Matrix Transformations 355
  44c. Matrix Form of the Eigenvalue-eigenfunction Problem 359
  *44d. Matrices with Continuous Elements 363
45. The Matrix Theory of Heisenberg, Born, and Jordan 366
  45a. Fundamental Postulates 366
  45b. Correlation of the Heisenberg and Schrodinger Theories 368
  45c. Solution of Matrix Equations of Motion for an Ideal Linear Oscillator 370
  45d. Reduction of the Fundamental Problem of Matrix Mechanics to a Principal-axis Transformation 372
46. The Bohr Correspondence Principle and Its Relationship to Matrix Theory 374
  46a. The Bohr Postulates 374
  46b. The Bohr Correspondence Principle and the Heisenberg Matrix Theory 375

CHAPTER XI

Theory of Perturbations Which Do Not Involve the Time 380

47. The Perturbation Theory for Nondegenerate Problems 380
  47a. First-order Perturbations 380
  47b. Second-order Perturbations 384
  47c. An Example: The Diatomic Molecule 386
48. The Perturbation Theory for Degenerate Problems 388
  48a. First-order Energy Perturbations 388
  48b. Second-order Energy Perturbations 391
  *48c. Van Vleck's Method for Second-order Perturbations 394
  48d. Simplification of Perturbation Calculations by Means of Integrals of the Perturbed Hamiltonian 396
49. The Energy Levels of an Hydrogenic Atom in a Uniform Magnetic Field (Spin Neglected) 398
  49a. Derivation of Hamiltonian Operator 398
  49b. Legitimacy of the Perturbation Method 399
  49c. First-order Energy Correction; Relation to Magnetic Moment and Larmor Precession 400
  49d. The Second-order Energy Correction 402
50. The Energy Levels of an Hydrogenic Atom in a Uniform Electric Field 403
51. The Variational Method 408
  51a. Reduction of the Variational Problem to Algebraic Form 408
  51b. The Ritz Method 410
  *51c. Higher Roots of the Secular Equation 415
  51d. General Observations Regarding the Use of the Variational Method 416




  51e. Modifications of the Method; Construction of Eigenfunctions from Non-orthogonal System 418
52. The Problem of the Hydrogen Molecule 419
  52a. The Fixed-nuclei Problem 419
  52b. The Heitler and London Calculation 420
  52c. The Method of James and Coolidge 425

CHAPTER XII

Quantum Statistical Mechanics and the Einstein Transition Probabilities 427

53. Quantum Statistical Mechanics 427
  53a. The General Theory of Perturbations Which May Involve the Time 427
  53b. The Adiabatic Theorem 431
  53c. The Fundamental Problems of Quantum Statistical Mechanics 432
  53d. The Conventional Characterization of a Chaotic Assemblage 434
  53e. Transition Probabilities and Statistical Equilibrium for Chaotic Assemblages 439
  53f. The Gibbs Canonical Assemblages for Systems of the Most General Type 440
54. The Absorption and Emission of Radiation: Perturbation of an Atomic System by a Classical Radiation Field 448
  54a. The Einstein Derivation of the Planck Radiation Formula 448
  54b. Elementary Approaches to the Quantum Theory of the Einstein Transition Probabilities 450
  54c. The Perturbing Hamiltonian for a Classical Radiation Field 452
  54d. The Born Transition Probability 454
  54e. The Einstein Transition Probabilities 458
  54f. Spectroscopic Stability 462
  *54g. Magnetic Dipole and Electric Quadrupole Radiation 462
55. Some Elementary Selection Rules for Electric Dipole Radiation 469
  55a. The Harmonic Oscillator 469
  55b. Selection Rules for the Two-particle Problem 470
  55c. Fine Structure and Polarization of Spectrum Lines in Simple Zeeman Effect 471

CHAPTER XIII

Introduction to the Problem of Atomic Structure: Electron Spin 474

56. The Atomic Problem as a Two-particle Problem 474
  56a. The Empirical Basis for the Idealized Bohr Atom Model 474
  56b. Derivation of the Ritz Formula 478
57. The Bohr Assignment of Electronic Quantum Numbers 481
  57a. The Quantum Numbers of the Valence Electrons in the Spectra of the Alkalies and Alkaline Earths 481
  57b. Perturbation Theory and the Significance of an Assignment of Quantum Numbers to Inner Electrons 484
58. The Electron-spin Hypothesis 491
  58a. The Empirical Fine Structure of Spectrum Lines 491
  58b. The Combination of Angular Momenta 495
  58c. The Lande Magnetic Core Theory 498
  58d. Solution of the Fine-structure Problem by the Electron-spin Hypothesis 500
59. The Fine Structure of the Spectra of Atomic Systems with a Single Valence Electron 503




60. The Approximate Relativistic Theory of the Hydrogen Atom 507
61. The Pauli Wave-mechanical Formulation of the Theory of Electron Spin 510
  61a. Nature of the Configuration Space and Wave Functions 510
  61b. Preliminary Discussion of Spin Operators and Spin Matrices 512
  61c. Application of the Pauli Theory to the Alkali Doublets 519

CHAPTER XIV

The Theory of the Structure of Many-electron Atoms 523

62. General Formulation of the Problem 523
  62a. The Configuration Space 523
  62b. The Hamiltonian Operator 524
  62c. The Perturbation Form of the General Atomic Problem 526
63. Problem B: The Spin-orbit Energy Neglected 528
  63a. Integrals of the Motion 528
  63b. Antisymmetric Functions and the Empirical Pauli Exclusion Rule 533
  63c. Closed Shells 536
  63d. Terms Originating in a Given Configuration 537
64. Selection Rules for Electric Dipole Radiation 540
  64a. The Laporte Rule 540
  64b. Selection Rules for the Central-field Problem 541
  64c. Selection Rules for Problems B and C 543
65. The Helium Atom and Exchange Energy 547
  65a. Two-electron Atoms 547
  65b. The Exchange Phenomenon 553
66. Diagonal Sums and the Problem B Energy Levels 555

Appendices 557

A. The Calculus of Variations and the Principle of Least Action 557
B. Derivation of Equation (15.7) 564
C. Theorems Regarding the Linear Oscillator Problem 568
D. Mathematical Notes on the B.W.K. Method 572
E. The Reduction of Certain Boundary-value Problems Based on Self-adjoint Differential Equations to Variational Form 579
F. The Legendre Polynomials and Associated Legendre Functions 583
G. The Generalized Laguerre Orthogonal Functions 585
H. Two Theorems Relating to the Continuous Spectrum 588
I. Concerning the Expansion of Hf in Spherical Harmonics 592
J. The Jacobi Polynomials 594
K. Schlapp's Method 596

Name Index 599

Subject Index 603



NOTATION 


The number of different physical and mathematical quantities to be
represented by separate symbols in this book is embarrassingly large in
comparison with the available letters of the Roman and Greek alphabets.
For this reason the establishment of a one-to-one correspondence of
symbols and meanings has proved impracticable. The author has
endeavored to keep the notation consistent within each chapter and, with
a few exceptions which should not be confusing, has used only one symbol
for each well-defined and recurrent meaning.

The following notes may be of use to the reader who attempts to dip 
into the middle of the book. 

An asterisk * used as a superscript denotes the complex conjugate of
the number or function in question.

Ordinarily the symbol Ψ denotes a time-dependent wave function,
while ψ indicates the time-free space factor of a monochromatic or
single-energy Ψ. At times ψ is also used for the instantaneous form of a general
wave function.

Vectors are indicated by superior arrows. 

Three-dimensional vector and scalar products are indicated by the
conventional $\times$ and $\cdot$, e.g., $\vec{A} \times \vec{B}$ and $\vec{A} \cdot \vec{B}$.

The scalar products of many-dimensional complex vectors and of
functions are denoted by heavy parentheses, e.g.,

$$(\vec{A}, \vec{B}) = \sum_k A_k B_k^*,$$

$$(\psi(x), \varphi(x)) = \int \psi \varphi^* \, dx.$$

In Chap. IV the norm of a function f, viz., (f, f), is indicated by Nf,
while the magnitude, or square root, of the norm is indicated by ||f||.

Sa' denotes a mixed process of summation and integration over all
eigenvalue points in a'-space. Cf. p. 246.

Matrices are denoted by boldface type or by a typical element
enclosed in double vertical rules. Thus,

$$\mathbf{H} = \| H(m, n) \|.$$

The first of the two indices of the typical element of an ordinary two- 
dimensional matrix indicates the row, while the second denotes the 
column. 

The Dirac notation for an eigenfunction of a in x'-space, viz., (x'|a'),
is introduced in Sec. 36h, while the Dirac notation for matrix elements,
e.g., $(\beta''|\gamma|\beta')$, appears in Sec. 44d. The Dirac symbolism is employed
only at points where it is particularly convenient.




REFERENCE ABBREVIATIONS

D. P.       Differentialgleichungen der Physik, Riemann-Weber, Braunschweig, edition of 1927.

E. Q.       Elementare Quantenmechanik, M. Born and P. Jordan, Berlin, 1930.

M. G. Q.    Mathematische Grundlagen der Quantenmechanik, J. v. Neumann, Berlin, 1932.

M. M. P.    Methoden der Mathematischen Physik I, R. Courant and D. Hilbert, Berlin, 2d ed., 1931.

P. Q. M.    The Principles of Quantum Mechanics, P. A. M. Dirac, Oxford, 1st ed., 1930; 2d ed., 1935.

Q. M.       Quantum Mechanics, E. U. Condon and P. M. Morse, New York, 1929.

T. A. S.    The Theory of Atomic Spectra, E. U. Condon and G. H. Shortley, Oxford, 1935.





FUNDAMENTAL PRINCIPLES 
OF QUANTUM MECHANICS 

WITH ELEMENTARY APPLICATIONS 

CHAPTER I 

INTRODUCTION TO THE DUALISTIC THEORY OF MATTER; 
DEVELOPMENT OF SCHRODINGER’S WAVE EQUATION 
1. HISTORICAL INTRODUCTION 

The first step toward the formulation of the quantum theory was
made in 1900 by Max Planck in the course of a theoretical investigation
of the laws of thermal radiation.¹ His problem was to explain the
distribution of energy in the continuous spectrum of a heated black
body as a function of its temperature. The experimental fact is that the
intensity per unit frequency interval rises from zero at very low
frequencies to a maximum value whose position and magnitude depend
upon the temperature, then falls again, approaching zero at very high
frequencies. The drop in intensity in the high-frequency region is in
violent conflict with a theoretical result previously obtained by the
elder Lord Rayleigh² on the basis of the equipartition theorem derived
from the classical statistical mechanics and of the wave theory of light.
Planck attributed the discrepancy to the breakdown of the equipartition
theorem when applied to high-frequency oscillations and made the
brilliant suggestion that, if the vibrating matter particles which emit
radiation have motions restricted to certain discrete energy values, or
energy levels, there would be a departure from the laws of the classical
statistical mechanics of the sort required by the experimental facts.
On the basis of this hypothesis Planck was able to derive a formula for
the intensity of the radiation in terms of temperature and frequency
which fits the empirical data within the limits of experimental uncertainty.
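
The conflict just described can be stated compactly in modern notation. The symbols below are not taken from this text and are offered only as an illustrative sketch: $u(\nu, T)$ is assumed to denote the energy density of black-body radiation per unit frequency interval (proportional to the intensity discussed above), $k$ Boltzmann's constant, $h$ Planck's constant, and $c$ the velocity of light. The first line is the equipartition result in the form now usually called the Rayleigh-Jeans law; the second is Planck's formula.

```latex
% Equipartition (Rayleigh-Jeans) result: increases without limit with frequency
u_{\mathrm{RJ}}(\nu, T) = \frac{8\pi \nu^{2}}{c^{3}}\, kT

% Planck's formula: mean oscillator energy kT replaced by h\nu / (e^{h\nu/kT} - 1)
u_{\mathrm{P}}(\nu, T) = \frac{8\pi \nu^{2}}{c^{3}}\, \frac{h\nu}{e^{h\nu/kT} - 1}

% Limiting behavior of the mean oscillator energy
\frac{h\nu}{e^{h\nu/kT} - 1} \approx kT \quad (h\nu \ll kT), \qquad
\frac{h\nu}{e^{h\nu/kT} - 1} \approx h\nu\, e^{-h\nu/kT} \quad (h\nu \gg kT)
```

At low frequencies the two expressions agree, while at high frequencies the exponential factor produces the observed fall toward zero where the equipartition expression continues to rise.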

1 M. Planck, Verh. d. deut. physik. Gesell. 237 (1900); Ann. d. Physik 4, 563
(1901).

2 Rayleigh, Phil. Mag. 49, 539 (1900).

The acceptance of Planck's suggestion meant a complete revolution
in physics since it was incompatible with both the Newtonian mechanics
and the electromagnetic theory of light. As corollaries, one may infer
at once that the mechanics of the collisions between atoms are completely
nonclassical and that the light is emitted and absorbed in discrete
“parcels” or “quanta.” A natural inference is that radiant energy is
corpuscular in character. Planck himself was unwilling to entertain
so radical an hypothesis and spent much time and energy in an ultimately
fruitless attempt to save the wave theory of light by a modification of his
original energy-level assumption. In fact so firm was the hold of the
wave theory on the minds of all physicists that it was not until 1905
that Einstein,¹ then a young man of 26, seriously revived the corpuscular
conception of the nature of light.

The application of Planck's hypothesis to the problem of the structure
of matter was still further delayed. Although Einstein showed in 1907
that it contained the key to the problem of the low-temperature specific
heats of solids, it was not until 1913 that Bohr² united the Rutherford
nuclear conception of the atom with the energy-level hypothesis to
formulate his famous theory of the structure and spectrum of hydrogen.

The initial success of the young Daiie^ was followed by a period of 
rather feverish, but fruitful, activity for both experimental and theoretical 
physicists. New discov(?ries followed one another in bewildiTing suc- 
cession, and in a short time scientific understanding of the nature of 
matter was immeasurably deepened. Prior to 1924, however, the 
theoretical developments were largely of an essentially provisional 
character. At that time the battle between the advocates of the wave 
theory of light and the proponents of the corpuscular theory had led to an 
unsatisfactory stalemate. Those who favored the corpuscnilar theory 
had made it abundantly clear that radiation has many of the properties 

1 A. Einstein, Ann. d. Physik (4) 17, 132 (1905); 20, 199 (1906). Einstein’s first 
paper on the theory of relativity was sent to the publishers less than 4 months after 
his first paper on the corpuscular theory of light! 

* N. Bohr, PhU. Mag. 26, 476, 857 (1913). 

3 Bohr was twenty-eight years old when he published his first paper on the theory 
of the hydrogen spectrum. In fact, the quantum theory has been from first to last a 
development by young men. Einstein, as already mentioned, wrote his initial paper 
on the corpuscular theory of light at the age of twenty-six. Heisenberg was twenty- 
four years old when he laid the foundation of the matrix mechanics. Dirac and 
Jordan wrote their first important papers at the ages of twenty-four and twenty- 
three, respectively. W. Pauli, Jr., was already a figure of importance in theoretical 
physics when at twenty-five he formulated the exclusion principle which bears his 
name. Uhlenbeck and Goudsmit were twenty-five and twenty-three years of age, 
respectively, when they invented the spinning electron. L. de Broglie published his 
first paper on the wave theory of the electron at thirty-one, while Schrödinger's most 
important papers on wave mechanics were written at the relatively advanced age of 
thirty-nine. Of course the contributions of older men, especially Sommerfeld and 
Born, have been exceedingly valuable, but one cannot but be impressed with the 
importance to science of a system of education which enables young men to finish 
their preliminary training and start their career of productivity while the extraordinary 
mental energy of youth is still in full vigor. 





to be expected from their model. But no satisfactory way of accounting 
for the characteristic wave phenomena of interference and diffraction 
on the basis of a pure particle theory had been found. There was 
abundant evidence of the reality of the energy levels postulated by 
Planck and Bohr, yet it had also become clear that Bohr's makeshift 
combination of classical mechanics and "quantum conditions" was 
inadequate for the working out of an exact theory of atomic structure. 
Moreover, aperiodic phenomena and the problem of the interaction of 
atoms in the formation of molecules and solids were practically untouched. 

The temporary retardation in the progress of theoretical physics 
brought about by the limitations inherent in the Bohr theory was finally 
broken by the introduction of new fundamental hypotheses by Louis 
de Broglie¹ and Werner Heisenberg.² To de Broglie we owe the suggestion 
that matter may share the dualistic characteristics of radiation 
by combining the properties of waves with those of corpuscles. To 
Heisenberg we owe a scheme for the exact description of atomic dynamical 
systems by means of a new kinematics based on Bohr's "correspondence 
principle." De Broglie's hypothesis in the hands of Schrödinger³ 
received the definitive form now generally accepted, and Heisenberg's 
method was converted into a powerful matrix calculus by Born and 
Jordan.⁴ Both suggestions, despite their extreme dissimilarity, proved 
to be of great value. Fused into a single theory which we call the 
“quantum mechanics,” they correct the deficiencies of the Bohr theory 
as a tool for investigating the structure of matter, relate the newly 
discovered diffraction of electron beams to the problem of locating 
atomic energy levels, and go a long way toward removing the dilemma 
regarding the nature of radiant energy. 

In the form of quantum theory now most generally accepted, the 
dualistic nature of radiation is treated as a fact to be described rather 
than explained or exorcised. In accordance with de Broglie’s hypothesis, 
a similar dualistic nature is ascribed to matter, and thus a unification 
in the treatment of matter and radiation is attained. The fundamental 
similarities between the assumed characteristics of matter and radiation 
form one of the most striking features of present physical theory. Differences 
remain, to be sure, and we can by no stretch of the imagination 
identify these two modes of existence, but the analogy is far reaching 
enough to permit the use of observations regarding the characteristics of 
radiation as guides in the construction of a theory of matter. The 
de Broglie-Schrödinger wave mechanics is the result of a conscious 

1 L. de Broglie, Nature 112, 540 (1923); Thesis, Paris, 1924; Ann. de Physique 
(10) 3, 22 (1925). 

2 W. Heisenberg, Zeits. f. Physik 33, 879 (1925). 

3 E. Schrödinger, Ann. d. Physik 79, 361, 489 (1926). 

4 M. Born and P. Jordan, Zeits. f. Physik 34, 858 (1925). 





attempt to follow such guides and affords a relatively easy method of 
approach to the general theory. 

2. THE DUALISTIC THEORY OF RADIATION 

The importance of optical analogy in the development of quantum 
mechanics lies primarily in the fact that the dualistic nature of light 
is much more obvious than the dualistic nature of matter. In the region 
of long waves the wavelike characteristics of radiation are strongly 
predominant, while in the X-ray region the corpuscular characteristics 
are more obvious. As the transition from one part of the spectrum to the 
other is continuous, the dualism is inescapable. On the other hand, 
serious technical difficulties stand in the way of a direct experimental 
study of long wave length matter waves, so that it is small wonder 
that the wavelike aspect of the nature of matter was discovered at a 
very late date. 

Let us therefore begin our study of the quantum mechanics with a 
preliminary examination of the properties of radiation. As previously 
stated, the result of the conflict between the wave theory of radiation and 
the corpuscular theory up to 1924 was a draw. The electromagnetic 
theory of Maxwell gave a simple and accurate account of interference, 
diffraction, and dispersion, besides making proper connection with 
quasi-static electromagnetic phenomena in the limiting region of very 
long waves. The corpuscular theory gave a simple and accurate account 
of the fundamental laws of the photoelectric effect and the Compton 
effect. It could be regarded as a logical corollary of the fundamental 
law of spectroscopy, 

$$E' - E'' = h\nu,$$

and it seemed necessary in order to account for the abrupt changes in 
momentum experienced by emitting and absorbing atoms and molecules 
in a radiation field.¹ Neither point of view gave a satisfactory description 
of the whole field of optics. In the case of the Doppler effect,² 
the predictions of the two theories were identical and in agreement with 
experiment. In some ways the two theories supplemented each other. 
For example, in the case of the inverse photoelectric effect (i.e., the 
production of the continuous X-ray spectrum) the corpuscular theory 
was needed to account for the sharply defined high-frequency limit to 
the spectrum, but the help of the wave theory was needed to account 
for the polarization of the radiation. Either point of view gave a qualita- 
tive explanation of the variation in hardness with direction of emission, 
which is, in fact, a kind of Doppler effect. 

1 A. Einstein, Physik. Zeits. 18, 121 (1917). 

2 E. Schrödinger, Physik. Zeits. 23, 301 (1922). 





To obtain a satisfactory theory of light, one must formulate a description 
of its behavior which embodies the characteristics of both of these 
conflicting points of view. As a first step toward the formulation of 
such a description, we observe that a similar controversy arose long ago 
in connection with matter. In bulk it has properties which are conveniently 
described from the continuum point of view. In particular, 
it may be the vehicle of sound waves which act like waves in a continuous 
medium. If matter be molecular in structure, however, we must expect 
this fact to be most evident in the properties of low-density gases. 
Experiments made on such gases do favor the molecular hypothesis 
and are regarded as crucial since high-density matter must in any case 
act in some ways like a continuum. Similarly, the corpuscular properties 
of radiation, if they exist, must be most evident if the corpuscles are of 
great energy and few in number, as in the case of low-intensity beams of 
hard X-rays. Precisely here the evidence for atomicity through the 
C. T. R. Wilson ray-track experiment and the Duane-Geiger point counter 
is most positive and definite. On the other hand, to get a test of one 
of the predictions of the wave theory one must have a record of the 
absorption of a quantity of light containing, on the basis of the particle 
theory, a very large number of photons. Thus one may say that interference 
experiments show that statistically light has the properties of 
waves without in any way directly disproving its granular structure. 
In other words, the experiments which were initially regarded as evidence 
against the corpuscular theory are not to the point but actually show 
merely that the properties of the corpuscles are different from those to be 
expected by analogy with classical mechanics. 

We then lay down as an initial postulate the hypothesis of the atomicity 
of radiation. As a second postulate, we assume in accordance 
with experiment that the wave theory gives a correct description of the 
average intensity distribution in ordinary interference and diffraction 
experiments. Thus, in predicting or describing the results of optical 
experiments, we make use of both concepts with the related mathematical 
machinery.¹ For the detailed correlation of the energy E and 
momentum p with the frequency ν and wave length λ we make use of the 
fundamental formulas due to Einstein: 

$$E = h\nu, \tag{2·1}$$

$$p = \frac{h}{\lambda}. \tag{2·2}$$

1 One may, if one likes, assume that light consists of both waves and corpuscles, or 
one may say that it consists of corpuscles guided by a "ghost" electromagnetic wave 
field [cf. W. F. G. Swann, Science 61, 433 (1925)]. The writer would prefer to regard 
both waves and corpuscles ultimately as mental aids in the description and prediction 
of empirical results, leaving all questions regarding their objective reality to the 
philosophers. 




Here h denotes Planck's constant as usual. These equations mean that 
radiation which is monochromatic from the point of view of the wave 
theory (i.e., gives a single sharp spectrum line when analyzed by a 
spectrometer) consists of photons of energy E = hν and momentum 
p = h/λ. We further suppose that in the case of a plane progressive 
wave the direction of the momentum of the associated particles is that 
of the forward normal to the wave front. 
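
As a rough numerical illustration of Eqs. (2·1) and (2·2), the short sketch below evaluates the energy and momentum of a photon of given wave length. The function name and the two sample wave lengths are illustrative assumptions and do not come from the text; the values of h and c are modern ones, expressed in the C.G.S. units used throughout this chapter.

```python
# Planck's constant (erg s) and speed of light (cm/s); modern values,
# assumed here for illustration only.
H_ERG_S = 6.62607015e-27
C_CM_S = 2.99792458e10


def photon_energy_momentum(wavelength_cm):
    """Return (E, p) for a photon of the given vacuum wave length.

    E = h * nu, with nu = c / lambda   (Eq. 2·1)
    p = h / lambda                     (Eq. 2·2)
    """
    nu = C_CM_S / wavelength_cm
    energy = H_ERG_S * nu
    momentum = H_ERG_S / wavelength_cm
    return energy, momentum


if __name__ == "__main__":
    # Hypothetical sample wave lengths: visible light and a hard X-ray.
    for label, lam in [("green light", 5.0e-5), ("hard X-ray", 1.0e-9)]:
        E, p = photon_energy_momentum(lam)
        print(f"{label}: E = {E:.3e} erg, p = {p:.3e} g cm/s")
```

In vacuum the two relations together imply E = pc, which provides a simple consistency check on any such calculation.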

It is immediately evident that these assumptions create a theory of 
radiation which unites the partial successes of the Maxwell theory with 
those of the corpuscular point of view. They do not answer all questions 
regarding the interaction of light and matter, but that is not to be 
expected without a fully developed theory of matter. Neither do they 
make any attempt to answer the question: "Why does light act in some 
respects like an assemblage of corpuscles and in other respects like a 
spreading-wave phenomenon?" Instead, they describe the dualistic 
behavior of light. As it is now generally recognized that description 
rather than explanation is the true function of physical theory, this 
procedure is entirely correct. Furthermore, the assumptions carry 
with them the tacit or explicit admission that deterministic "models"¹ 
are of little use in dealing with the radiation problem. In fact, the 
problem of the nature of radiation as seen by physicists prior to 1925 
loses its point as soon as such deterministic models are cast aside. 

Statistics and indeterminism enter the theory of light when one 
assumes that the distribution of discrete corpuscles in space is to be 
calculated by means of continuous wave functions. This can only 
mean that the intensity of light of frequency ν in any small volume G 
as computed from the wave functions is a measure of the probable 
number of photons of energy hν in G. If the energy in G were measured 
n times, the individual measurements would necessarily show fluctuations 
or departures from the mean, although, according to the theory, the 
mean itself would approach the computed value as n becomes very large.² 
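
The statistical statement above can be imitated with a toy simulation. The sketch below assumes, purely for illustration, that each of a fixed number of photons is found in G independently with a probability equal to the normalized wave intensity there; the independence assumption, the function names, and the sample numbers are not taken from the text.

```python
import random


def count_photons_in_G(n_photons, prob_in_G, rng):
    """One 'measurement': how many of n_photons are found in the volume G.

    Each photon is assumed to land in G independently with probability
    prob_in_G (the normalized wave intensity in G).
    """
    return sum(1 for _ in range(n_photons) if rng.random() < prob_in_G)


def repeated_measurements(n_trials, n_photons, prob_in_G, seed=0):
    """Repeat the measurement n_trials times and return the list of counts."""
    rng = random.Random(seed)
    return [count_photons_in_G(n_photons, prob_in_G, rng) for _ in range(n_trials)]


if __name__ == "__main__":
    n_photons, prob_in_G = 100, 0.2
    computed = n_photons * prob_in_G  # value predicted from the wave intensity
    for n_trials in (10, 100, 10000):
        counts = repeated_measurements(n_trials, n_photons, prob_in_G, seed=1)
        mean = sum(counts) / len(counts)
        print(f"n = {n_trials:5d}: mean = {mean:.3f} (computed value {computed}), "
              f"range {min(counts)} to {max(counts)}")
```

Individual counts scatter about the computed value, while the mean over many trials settles toward it, which is all that the wave description is here taken to predict.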

Corresponding to this theoretical indeterminism, there is an experi- 
mental indeterminism evidenced, for example, in the haphazard devel- 
opment of the grains on a photographic plate. The ideal test for 
determinism or indeterminism would be, of course, to perform repeatedly 
the same experiment with the "initial conditions" exactly controlled, and 
then see whether or not the results are identical. As this is in practice 

1 We here use the term "model" in a very broad sense to describe either the 
classical concept of a particle or of a wave with all the tacit assumptions formerly 
bound up with these concepts. 

2 The necessity for indeterminism is also evident if we consider the problem of 
reflection. Waves can always divide themselves in a definite way at the interface of 
two media, but each individual photon must either be reflected as a whole or pass 
across the boundary. 




impossible, we can only examine the results under conditions as similar 
as possible to see whether the deviations from the mean of the results 
are commensurate with the uncertainty regarding the initial conditions. 
In the case of photons, the best we can do is to throw a beam of plane-parallel 
light on a small aperture and allow the emergent beam to fall 
upon a photographic plate. There is then an uncertainty regarding the 
point at which any individual developable photographic grain will 
appear, measured by the effective diameter of the illuminated portion 
of the plate. This uncertainty can be decreased to a certain limiting 
value by decreasing the area of the aperture. But, if the opening is too 
small, diffraction causes the illuminated area to increase once more and 
thus presents a complete barrier to an indefinite reduction of the experimental 
uncertainty. Thus an exact control of the "future" is impossible 
in such an optical experiment. Whether this is due to the fact that the 
initial conditions for the various photons cannot be exactly relocated, or 
to the fact that the initial conditions do not exactly control the future, 
is a futile question, since it is incapable of experimental investigation. From 
a practical standpoint, the field of optics is indeterministic and must 
remain so, unless some new mode of experiment is discovered which 
permits a more exact control over the behavior of photons than any we 
now have. If the union of particle-theory and wave-statistics outlined 
above is fundamentally correct, no such mode of experiment can exist. 

3. AN ANALOGY BETWEEN GEOMETRICAL OPTICS AND CLASSICAL MECHANICS 

In order to formulate a theory of matter paralleling the dualistic 
theory of radiation sketched in the preceding section, we must invent a 
suitable differential equation for the wave function. This equation 
may be assumed to have a form similar to that of the wave equation of 
optics but must be so designed as to harmonize with the Newtonian 
mechanics in the limiting case when diffraction effects are negligible. 
If it is possible to set up such an equation, there should be a certain 
similarity between the Newtonian mechanics and diffractionless, or 
geometrical, optics. The desired similarity does exist and is evident 
at once if one compares the principle of least time in optics (Fermat's 
principle) with the principle of least action in mechanics.¹ 

The optical principle states that the path of a ray of light (wave 
front normal) from a point A to a point B is always such as to give the 
integral 

$$\int_A^B \frac{ds}{w}$$

1 The principle of least action originated in an attempt by Maupertuis to obtain 
for the corpuscular theory of light a theorem analogous to Fermat's principle (cf. 
E. T. Whittaker, Analytical Dynamics, 2d ed., p. 248, Cambridge, 1917). 





an extreme value (usually a minimum) with respect to the integrals 
over all other conceivable paths for rays of the same color, or frequency. 
In this formula w denotes the local phase velocity of light and is a function 
of the frequency and the space coordinates, say x, y, z. As the frequency 
ν is treated as a constant in varying the integral, and as the local wave 
length λ is equal to w/ν, we may substitute λ for w in the statement 
of the principle. Using the notation of the calculus of variations we 
have 

$$\delta \int_A^B \frac{ds}{\lambda(\nu,x,y,z)} = 0. \qquad (\nu \text{ unvaried}) \tag{3·1}$$

This means that the true path of the ray is characterized by the fact that 
for it the first variation of the path length measured in wave lengths is 
zero. 

The usual elementary derivation of Fermat's principle¹ is valid only 
for light rays moving in homogeneous media bounded by plane surfaces of 
discontinuity and is based on the assumptions of geometrical optics, 
viz., that light rays travel in straight lines in such media except at the 
boundaries where they are regularly reflected or refracted in accordance 
with the sine law 

$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{w_1}{w_2}.$$

The extension of the law to inhomogeneous media may be obtained by 
a limiting process in which the inhomogeneous medium is approximated 
by a heterogeneous aggregation of homogeneous volume elements, each 
differing slightly in index of refraction from its neighbors. Since the 
spreading of light waves is fully determined by the "wave equation" 

$$\nabla^2\psi = \frac{1}{w^2(x,y,z,\nu)}\,\frac{\partial^2\psi}{\partial t^2}, \tag{3·2}$$


we may conclude that Fermat's principle is deducible under suitable 
restrictions from this equation. The mathematical verification of this 
theorem will be given at a later stage of our argument (Sec. 12, p. 46). 
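
The connection between the principle of least time and the sine law can be checked numerically for the simplest case of two homogeneous media separated by a plane. The sketch below is only an illustration under stated assumptions: the geometry (a source at height a above the interface, a receiver at depth b below it, horizontal separation d), the function names, and the use of a ternary search to locate the minimum are all choices made here, not part of the text.

```python
import math


def travel_time(x, a, b, d, w1, w2):
    """Time along the broken path A -> (x, 0) -> B.

    A = (0, a) lies in the upper medium (phase velocity w1) and
    B = (d, -b) in the lower medium (phase velocity w2).
    """
    return math.hypot(x, a) / w1 + math.hypot(d - x, b) / w2


def fermat_crossing(a, b, d, w1, w2, iters=200):
    """Locate the crossing point x that minimizes the travel time.

    The travel time is a convex function of x, so a ternary search
    on the interval [0, d] converges to the minimum.
    """
    lo, hi = 0.0, d
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, a, b, d, w1, w2) < travel_time(m2, a, b, d, w1, w2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def snell_ratio(a, b, d, w1, w2):
    """Return sin(theta1) / sin(theta2) for the least-time path."""
    x = fermat_crossing(a, b, d, w1, w2)
    sin1 = x / math.hypot(x, a)            # angle of incidence
    sin2 = (d - x) / math.hypot(d - x, b)  # angle of refraction
    return sin1 / sin2


if __name__ == "__main__":
    w1, w2 = 1.0, 0.75  # hypothetical phase velocities in arbitrary units
    print("sin(theta1)/sin(theta2) =", snell_ratio(1.0, 1.0, 1.0, w1, w2))
    print("w1/w2                   =", w1 / w2)
```

For any choice of a, b, and d the ratio of the sines found in this way should agree with w1/w2 to within the accuracy of the search.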

Let us now compare Fermat's principle in the form of Eq. (3·1) with 
the principle of least action. In the case of a single particle of total 
energy E, kinetic energy T, and mass μ moving through a force field 
with potential energy V(x,y,z), the latter principle requires that the 
value of the action integral 

$$\int_A^B 2T\,dt = \int_A^B \sqrt{2\mu(E-V)}\,ds \tag{3·3}$$


1 Cf. P. Drude, Theory of Optics, Chap. I, Art. 2, English translation by Mann 
and Millikan, New York, 1917. 




for the natural, or mechanical, path between the two points A and B 
shall be an extremal as compared with its values for adjacent paths 
and the same value of E. For such a particle the integrand in the second 
form of the integral, viz., $\sqrt{2\mu(E-V)}$, is the absolute value of the 
instantaneous momentum which the particle would assume at x, y, z 
if it had the energy E. Denoting this quantity by p(E,x,y,z), we may 
state the principle in the form (cf. Appendix A) 

$$\delta \int_A^B p(E,x,y,z)\,ds = 0. \qquad (E \text{ unvaried}) \tag{3·4}$$

It will be convenient to call p(E,x,y,z) the classical local momentum to 
distinguish it from the true quantum mechanical momentum to be 
defined in Secs. 14 and 15 and from the local momentum of Young.¹ 

Evidently Eqs. (3·1) and (3·4) present formally identical mathematical 
problems. In fact, the paths of particles in the Newtonian dynamics 
may be identified with the "rays" of a problem in geometrical optics 
in which the wave length is adjusted to make the integrand of the 
integral of Eq. (3·1) proportional to the integrand of Eq. (3·4). To be 
precise, the condition imposed on the wave problem is that 

$$\frac{C}{\lambda} = \frac{C\nu}{w} = p(E,x,y,z) = \sqrt{2\mu[E - V(x,y,z)]}, \tag{3·5}$$

where C may be any function of ν, and ν in turn must be a function of E. 

Comparing Eqs. (3·5) and (2·2), we observe that the required relation 
between the classical local momentum of the particle and the corresponding 
local wave length is identical, except for a possible constant factor, 
with the relation between momentum and wave length in the dualistic 
theory of light. 

The analogy between the variational principles of geometrical optics 
and particle dynamics was seized upon by Sir William Hamilton in the 
early part of the nineteenth century and used as a guide in the development 
of dynamical and optical theory. The Hamilton-Jacobi partial 
differential equation is the fruit of this development, but Hamilton 
himself regarded the analogy as an analogy only. It remained for 
de Broglie and Schrödinger to show that it may profitably be used as a 
stepping stone for the development of a true wave mechanics similar 
in form to physical rather than geometrical optics. 

If Hamilton had worked a little later, he might even have discovered 
Schrödinger's wave equation, for he would almost certainly have been 
led to it, if he had sought a wave equation for his waves. But in his 
time the wave theory of light was just beginning to be investigated, 
and it was not yet the fashion to describe waves as solutions of a partial 

1 L. A. Young, Phys. Rev. 38, 1612 (1931); 39, 455 (1932). 





differential equation. Hamilton contented himself, as was the custom 
in optics, with investigating the positions of the wave fronts of his waves. 
He found the wave length of mechanical waves as it depends on position, 
the index of refraction as a function of the potential energy, and so on, 
the latter being simply the carrying over to mechanics of Newton's 
idea of the optical index of refraction as a function of potential. But 
he did not try to set up a wave equation. And neither, at first, did 
de Broglie, led so much later to precisely similar conceptions. The 
final step of writing out such an equation for matter waves and applying 
it both to large-scale mechanics and to the mechanics of atomic systems 
was left to Schrödinger. 

4. WAVE PACKETS AND GROUP VELOCITY 

The parallelism between the principle of least action and Fermat's 
principle does not complete the reduction of the laws of Newton's 
mechanics to a form similar to that of the laws of geometrical optics. 
These principles deal solely with the paths of particles and rays, saying 
nothing about the time required to traverse the path. In order to extend 
the discussion to take in the time, we must define the relation between 
the motion of a large-scale body and the corresponding matter waves in 
the limiting case where a sharply defined orbit exists. Following optical 
analogy we postulate that the intensity of the matter waves associated 
with any particle at any space-time point x,y,z,t measures the probability 
that the particle is in the neighborhood of x,y,z at the time t. Then 
if the particle is to have a fairly definite orbit, it must be associated with 
a localized wave disturbance which moves with it over a definite path. 
Such disturbances are familiar to students of physical optics or of other 
types of wave motion. They are usually called "wave packets" because 
they can be analyzed into superpositions of infinite plane monochromatic 
waves involving a narrow range of wave lengths and directions of wave 
normal, and because this analysis is of fundamental importance in understanding 
their behavior as time goes on (cf. Chap. II, Secs. 9 to 12). 
Optical wave packets can be formed from monochromatic optical wave 
trains if a diaphragm is used to cut them off laterally and a shutter to 
cut them off longitudinally. In a non-dispersive medium where the 
speed is independent of the wave length it is not necessary to start with 
a monochromatic train. Thus Fizeau in measuring the velocity of 
light used wave packets formed by white light passing through apertures 
in the rim of a revolving toothed wheel. To use this method in a medium 
with appreciable dispersion it would be essential to start with approxi- 
mately monochromatic light, since the different colors have different 
speeds. 

The essential requirements to be imposed upon a wave disturbance in 
order that it shall constitute a wave packet are that (a) it shall occupy a 





small volume, (b) it shall travel with a definite speed, and (c) it shall 
travel in a definite direction. These requirements are to a certain extent 
mutually contradictory, as may be proved theoretically or demonstrated 
by appropriate experiment. Thus, in the case of an infinite train of plane 
waves the direction of motion is perfectly definite, but the disturbance 
is not localized at all. Partial localization can be produced by allowing 
the beam to impinge on an absorbing diaphragm provided with an 
aperture which, for definiteness, we assume to be circular and of radius R. 
This localization, however, is accompanied by diffraction effects which 
mar the sharpness of definition of the wave normal. If the initial beam 
is monochromatic and impinges normally on the diaphragm, and if, in 
addition, the radius R is large compared with the wave length, the 
diffraction effects are small and the beam is bounded rather sharply by 
the edge of the geometric shadow of the aperture. The direction of 
motion remains quite well defined. In the optical case this phenomenon 
is called the "rectilinear propagation of light." But if the localization 
of the beam in the plane of the diaphragm is made more complete, by a 
gradual reduction in R, the diffraction effects become more and more 
important until, in the limiting case where R is much smaller than the 
wave length, the emergent beam takes the form of a train of hemispherical 
waves with no definite direction of motion at all. 

Similarly we may localize the wave train longitudinally if we reduce 
it to finite length in any manner. If the train is monochromatic and 
long enough to include many waves it will travel with a fairly definite 
speed even in a dispersive medium. (Since the speed of material particles 
varies with their energy, it is clear that the speed of matter waves must 
vary with their frequency as in the case of optical waves in a dispersive 
medium.) This speed of the head and tail of such a finite train is not, 
however, the same as the speed of the wave crests (phase velocity) unless 
the medium happens to be non-dispersive. Usually the individual waves 
either gain on the head of the train or fall back toward the tail to fade 
out into nothingness when they reach the boundary of the train. The 
speed of the group of waves as a whole is called the group velocity¹ to 
distinguish it from the phase velocity or the speed of the individual 
waves. However, if the train is made so short that it contains only a 
few waves, or perhaps only a fraction of a wave, the group does not hang 
together, but spreads longitudinally as it proceeds as if composed of 
dissimilar elements traveling at different rates of speed. Consequently 
this second kind of localization of a wave disturbance must not be carried 

1 Cf. A. Schuster and J. W. Nicholson, Theory of Optics, 3d ed., p. 3‘i(), London, 
1928; T. H. Havelock, The Propagation of Disturbances in Dispersive Media, Cambridge, 
1914. Havelock ascribes the first discussion of group velocity to Hamilton, 
Proc. Roy. Irish Acad. 1, 267, 341 (1839). 




too far if it is not to conflict with the second requirement for a wave 
packet. 

By a suitable compromise, then, we can devise wave disturbances 
appropriate to a dispersive medium which satisfy all three requirements 
for a wave packet. Such a disturbance will necessarily have a fairly 
well defined wave length and direction of motion, although the Fourier 
analysis would resolve it into a superposition of a continuous spectrum 
of infinite plane wave systems whose wave lengths and wave normals 
are spread out over a very narrow range grouped about the wave length 
and wave normal appropriate to the interior of the packet. We shall 
return to this Fourier analysis in the next chapter. For the present it 
will suffice to observe that the lack of sharpness in the definitions of the 
position, wave length, and direction of motion of such a packet is to be 
correlated with a corresponding lack of sharpness in the position, momentum, 
and direction of motion of the associated photon or matter corpuscle. 
The classical velocity and orbit of the corpuscle must then be identified 
with the velocity and orbit of the centroid of the wave packet. Thus 
classical mechanics is eventually to be regarded as a limiting case of the 
mechanics of matter wave packets. 

It follows that if the analogy between the principle of least time 
and the principle of least action really means anything it must be possible 
to show that the orbit and orbital velocity of a large-scale particle¹ 
in the Newtonian mechanics are identical with the orbit and orbital 
velocity of a wave packet in a suitably defined wave problem. To make 
use of the analogy of the preceding article we ought to show (a) that 
wave packets travel along the rays of geometric optics, and (b) that the 
speed of the packet is the same as that of the corresponding mechanical 
particle. 

To avoid interrupting the main argument here we shall assume proposition 
(a) without proof² for the present and will base our discussion of 
proposition (b) on the familiar formula for the group velocity of a finite 
train of waves in a homogeneous dispersive medium.³ Denoting the 
average, or interior, wave length of the group or packet by λ and the 
phase velocity by w, the usual expression for the group velocity v_g is 

$$v_g = w - \lambda\,\frac{\partial w}{\partial\lambda}.$$

1 The motion of the center of mass of a system of particles follows the same laws 
as the motion of a single particle both classically and in the quantum mechanics (cf. 
Sec. 15, p. 64). 

2 The identity of the paths of wave packets or narrow beams of light with the rays 
defined by the normals to the corresponding extended wave systems is commonly 
assumed without proof in textbooks on physical optics. The assumption is validated 
in Sec. 12, Chap. II. 

3 Cf. footnote 1, p. 11. 





Using the relation between wave length, frequency, and phase velocity we 
readily convert this expression into the more compact form, 

$$\frac{1}{v_g} = \frac{\partial}{\partial\nu}\left(\frac{1}{\lambda}\right), \tag{4·1}$$

where the differentiation is carried out with ν, x, y, z acting as independent 
variables. Proofs of the group-velocity formula for matter waves are 
given in Secs. 9, 10, and 12. 
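
The conversion between the two forms of the group-velocity formula is brief enough to write out. The steps below are a sketch supplied for convenience, assuming only the relation w = νλ and that x, y, z are held fixed, so that ν may be regarded as a function of λ alone; they start from the usual expression v_g = w − λ ∂w/∂λ and end at 1/v_g = ∂(1/λ)/∂ν.

```latex
\begin{align*}
w &= \nu\lambda
  \quad\Longrightarrow\quad
  \frac{\partial w}{\partial\lambda} = \nu + \lambda\,\frac{\partial\nu}{\partial\lambda}, \\
v_g &= w - \lambda\,\frac{\partial w}{\partial\lambda}
   = \nu\lambda - \nu\lambda - \lambda^{2}\,\frac{\partial\nu}{\partial\lambda}
   = -\lambda^{2}\,\frac{\partial\nu}{\partial\lambda}, \\
d\!\left(\frac{1}{\lambda}\right) &= -\frac{d\lambda}{\lambda^{2}}
  \quad\Longrightarrow\quad
  v_g = \frac{\partial\nu}{\partial(1/\lambda)}, \qquad
  \frac{1}{v_g} = \frac{\partial}{\partial\nu}\!\left(\frac{1}{\lambda}\right).
\end{align*}
```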

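If the energy of the particle and the wave length are related as in 
Eq. (3·5), and if the group velocity is identified with the speed of the 
particle v, Eqs. (3·5) and (4·1) give 

$$\frac{1}{v} = \frac{\partial}{\partial\nu}\left(\frac{p}{C}\right).$$

As p is a function of E, x, y, z and as the spatial coordinates are independent 
of ν, 

$$\frac{1}{v} = \frac{1}{C}\,\frac{\partial p}{\partial E}\,\frac{dE}{d\nu} - \frac{p}{C^{2}}\,\frac{dC}{d\nu}.$$

But 

$$\frac{\partial p}{\partial E} = \frac{\mu}{p} = \frac{1}{v};$$

hence 

$$\frac{dE}{d\nu} = C + \frac{pv}{C}\,\frac{dC}{d\nu}. \tag{4·2}$$

The product pv depends upon x, y, and z, but the left-hand member of 
the above equation is independent of these variables. Hence dC/dν 
must vanish. C is a constant, and Eq. (4·2) reduces to¹ 

$$\frac{dE}{d\nu} = C = p\lambda. \tag{4·3}$$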

The linear relationship between energy and frequency thus derived 
suggests the hypothesis that the Einstein energy-frequency relation (2·1) 
holds for matter as well as for radiation. If we assume the validity of 
Eq. (2·1) for matter, so that C = dE/dν = h, Eq. (4·3) requires that 
Eq. (2·2) shall also apply to matter. 
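
The conclusion can be checked numerically in the reverse direction: if E = hν and p = h/λ are assumed for a particle, the group velocity ∂ν/∂(1/λ) should reproduce the classical speed p/μ. The sketch below does this by a central finite difference. The function names, the step size, and the sample values (an electron of 10 electron-volts total energy in a region of constant potential energy) are illustrative assumptions only.

```python
import math

H_ERG_S = 6.62607015e-27  # Planck's constant, erg s (modern value, assumed)


def inverse_wavelength(nu, mu, V):
    """1/lambda = p/h with p = sqrt(2*mu*(E - V)) and E = h*nu."""
    E = H_ERG_S * nu
    return math.sqrt(2.0 * mu * (E - V)) / H_ERG_S


def group_velocity(nu, mu, V, rel_step=1e-6):
    """v_g = d(nu) / d(1/lambda), estimated by a central difference in nu."""
    dnu = nu * rel_step
    d_inv = inverse_wavelength(nu + dnu, mu, V) - inverse_wavelength(nu - dnu, mu, V)
    return 2.0 * dnu / d_inv


def particle_speed(nu, mu, V):
    """Classical speed v = p/mu = sqrt(2*(E - V)/mu) with E = h*nu."""
    return math.sqrt(2.0 * (H_ERG_S * nu - V) / mu)


if __name__ == "__main__":
    mu = 9.1093837e-28  # electron mass in grams (assumed sample particle)
    E = 1.602e-11       # roughly 10 electron-volts, in ergs
    V = 0.5e-11         # hypothetical constant potential energy, in ergs
    nu = E / H_ERG_S
    print("group velocity :", group_velocity(nu, mu, V), "cm/s")
    print("particle speed :", particle_speed(nu, mu, V), "cm/s")
```

The two numbers should agree to within the error of the finite-difference estimate, whatever constant value is chosen for V below E.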

The relation thus obtained between the classical local momentum 
of a particle and the length of the associated matter waves was first 
suggested by de Broglie.² It has been confirmed experimentally by the 

1 The proof here given of the linear relationship between E and ν is closely related 
to that given by F. D. Murnaghan and K. F. Herzfeld, Proc. Nat. Acad. Sci. 13, 330 
(1927). 

2 Loc. cit., footnote 1, page 3. 





electron diffraction experiments of Davisson and Germer,¹ Thomson,² 
and Rupp.³ When Eq. (2·2) is applied to bodies of macroscopic dimensions, 
the wave lengths obtained are exceedingly small. In the case 
of a golf ball weighing 47 grams and traveling with a speed as low as a 
millimeter in 10 sec., the wave length is 1.4 × 10⁻²⁶ cm! This means 
that diffraction effects are hopelessly beyond the reach of experiment 
in the case of large-scale bodies. On the other hand, the computed wave 
length becomes appreciable if Eq. (2·2) is applied to atomic or molecular 
problems. An oxygen molecule, for example, with a speed corresponding 
to the mean thermal energy of 300°K., has a wave length of approximately 
1.5 X 10“® cm., while an electron with a 10-volt^' kinetic energy has a 
wave length of 5.3 X 10“® cm. As these dimensions are of the order of 
magnitude of atomic diameters and X-ray wave lengths, it is clear that 
diffraction effects must play a prominent part in atomic dynamics. 
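Estimates of this kind follow directly from $\lambda = h/p$. The sketch below reproduces the golf-ball figure; the function name and the value used for Planck's constant are assumptions introduced for illustration.

```python
# de Broglie wave length lambda = h/p in CGS units.
# The function name and the constant value are illustrative assumptions.

H = 6.626e-27  # Planck's constant, erg s


def de_broglie_wavelength_cm(mass_g, speed_cm_per_s):
    """Return h/(mass*speed) in centimeters (nonrelativistic momentum)."""
    return H / (mass_g * speed_cm_per_s)


if __name__ == "__main__":
    # Golf ball of 47 g moving one millimeter in 10 sec., i.e. 0.01 cm/sec.
    print(de_broglie_wavelength_cm(47.0, 0.01))
```

The same function applies to atomic particles once a mass and a speed are supplied.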

It is important to note that by introducing a vector $\vec{\sigma}$ having the magnitude $1/\lambda$ and the direction of the wave normal, we can throw Eq. (2.2) into the vector form

$$\vec{p} = h\vec{\sigma}. \qquad (4.4)$$

We shall call $\vec{\sigma}$ the vector wave number. Its components $\sigma_x, \sigma_y, \sigma_z$ denote the number of waves per centimeter crossed by lines parallel to the $x$, $y$, and $z$ axes respectively.


5. THE SCHRÖDINGER WAVE EQUATION FOR A SINGLE PARTICLE

5a. The Time-free Equation. In view of the above results we may assume that a differential equation of the type of Eq. (3.2) is valid for matter waves, the phase velocity $w$ being determined in accordance with Eqs. (2.1) and (2.2). Then,

$$w = \lambda\nu = \frac{E}{p} = \frac{E}{\sqrt{2\mu[E - V(x,y,z)]}}, \qquad (5.1)$$

or

$$w = \frac{\tfrac{1}{2}\mu v^2 + V}{\mu v} = \frac{v}{2} + \frac{V}{\mu v}. \qquad (5.2)$$


(The relativistic formulation of the theory given in the next article yields a different expression for $w$.) Denoting the wave function for matter waves by $\Psi$ we may then write the wave equation in the form

$$\nabla^2\Psi = \frac{2\mu[E - V(x,y,z)]}{E^2}\frac{\partial^2\Psi}{\partial t^2}. \qquad (5.3)$$

¹ C. Davisson and L. H. Germer, Phys. Rev. 30, 705 (1927); Proc. Nat. Acad. Sci. 14, 317 (1928).

² G. P. Thomson, Proc. Roy. Soc. A117, 600 (1928); A119, 651 (1928).

³ E. Rupp, Ann. d. Physik 85, 981 (1928).







As in the optical case, the differential equation is applicable to monochromatic, or "mono-energetic," wave functions only.

This restriction means that all solutions of Eq. (5.3) which have physical significance are also to be solutions of the differential equation for a harmonic function of $t$:

$$\frac{1}{\Psi}\frac{\partial^2\Psi}{\partial t^2} = -4\pi^2\nu^2 = -\frac{4\pi^2E^2}{h^2}. \qquad (5.4)$$

Combining Eqs. (5.3) and (5.4), we obtain

$$\nabla^2\Psi + \frac{8\pi^2\mu}{h^2}[E - V(x,y,z)]\Psi = 0. \qquad (5.5)$$


This is the first form of Schrödinger's wave equation for a single particle. Since the variable parameter $E$ enters into the equation explicitly, Eq. (5.5) really includes a whole family of differential equations for each type of potential energy function $V$. We shall at times refer to (5.5) as the time-free wave equation to distinguish it from the equation (5.10) of Sec. 5b.
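To see how the parameter $E$ singles out particular members of this family, one can integrate the one-dimensional form of the time-free equation across a field-free box and ask for which $E$ the solution vanishes at both walls. The sketch below is an illustration only: it assumes units in which $h^2/8\pi^2\mu = 1$ and a box of unit length, so that the equation reads $\psi'' = -(E - V)\psi$ and the expected levels are $(n\pi)^2$. The function names and the shooting method are choices made here, not taken from the text.

```python
import math


def psi_at_wall(E, V, n_steps=2000, length=1.0):
    """Integrate psi'' = -(E - V(x)) psi from x = 0, where psi = 0,
    and return psi at x = length. Units: h**2/(8 pi**2 mu) = 1 (assumed)."""
    dx = length / n_steps
    psi_prev, psi = 0.0, dx  # psi(0) = 0 with an arbitrary initial slope
    for i in range(1, n_steps):
        x = i * dx
        psi_next = 2.0 * psi - psi_prev - dx * dx * (E - V(x)) * psi
        psi_prev, psi = psi, psi_next
    return psi


def find_level(E_lo, E_hi, V, tol=1e-10):
    """Bisect on E until psi vanishes at the far wall.
    Assumes exactly one level lies between E_lo and E_hi."""
    f_lo = psi_at_wall(E_lo, V)
    while E_hi - E_lo > tol:
        E_mid = 0.5 * (E_lo + E_hi)
        f_mid = psi_at_wall(E_mid, V)
        if f_lo * f_mid <= 0.0:
            E_hi = E_mid
        else:
            E_lo, f_lo = E_mid, f_mid
    return 0.5 * (E_lo + E_hi)


if __name__ == "__main__":
    def free(x):
        return 0.0

    print(find_level(5.0, 15.0, free), math.pi ** 2)
    print(find_level(30.0, 50.0, free), 4.0 * math.pi ** 2)
```

Restoring ordinary units, the levels found this way correspond to $E_n = n^2h^2/8\mu L^2$ for a box of length $L$.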

Equation (5.5) has the optical analogue

$$\nabla^2\Psi + \frac{4\pi^2n^2}{\lambda_0^2}\Psi = 0, \qquad (5.6)$$

where $n$ denotes the index of refraction and $\lambda_0$ is the wave length in a vacuum. Thus one may say that the essential feature of the Schrödinger equation is that it makes the index of refraction for matter waves at each point of space proportional to the momentum which the particle would have at that point, or to $\sqrt{2\mu(E - V)}$. In the case of a particle moving under the influence of the earth's gravitational field, for example, the index of refraction will increase as the point under consideration approaches the earth's surface. From the standpoint of the wave mechanics the parabolic path of the orbit is to be attributed to the bending of the matter waves on account of the resulting inhomogeneity of space with respect to their propagation. It is precisely analogous to the bending of the sun's rays as they pass obliquely through the earth's inhomogeneous atmosphere.

5b. The Second (General) Schrödinger Equation. Equation (5.5) is adequate as it stands for the investigation of the energy levels (frequency values) in a one-particle atomic problem, but in the case of the wave packets of Sec. 4, and in many other cases, non-monochromatic wave functions must be used. Hence we need a more general differential equation which does not contain the parameter $E$ or its equivalent $\nu$. Such a differential equation becomes a necessity when we have to do with problems in which the potential energy depends upon the time





explicitly, or in which for any reason the energy of the system is not conserved. A most important example is the perturbation of an atom by an external light wave which is the basis of the theory of dispersion. Here the assumption that we have to do with a single monochromatic wave function, or a fixed linear combination of such functions, breaks down entirely.

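The obvious procedure for deriving a differential equation applicable to a wave packet, or to the sum of several monochromatic waves, is to eliminate $E$ from Eq. (5.5) by differentiation. To do this we first write Eq. (5.5) in the form

$$\left[\nabla^2 - \frac{8\pi^2\mu}{h^2}V\right]\Psi = -\frac{8\pi^2\mu E}{h^2}\Psi.$$

Applying the operator $\left[\nabla^2 - \frac{8\pi^2\mu}{h^2}V\right]$ to each side of the equation and reducing the right-hand member with the aid of Eq. (5.4), we obtain

$$\left[\nabla^2 - \frac{8\pi^2\mu}{h^2}V\right]^2\Psi = -\frac{16\pi^2\mu^2}{h^2}\frac{\partial^2\Psi}{\partial t^2}. \qquad (5.7)$$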


This equation is linear and is valid for any monochromatic solution of Eq. (5.5), independent of the value of $E$. Hence it is satisfied by the sum of any finite or uniformly convergent infinite series of monochromatic solutions of Eqs. (5.3) and (5.5). We may therefore assume that every physically admissible wave function is a solution of Eq. (5.7). The converse proposition is not plausible, however, since the above equation is of the fourth order and admits of solutions which are not linear combinations of solutions of the family of equations (5.5). Moreover Eq. (5.7) is difficult to generalize for use in connection with nonconservative systems.¹

Fortunately there is a simpler equation than (5.7) which is adequate for our needs. Let us assume that the values of the wave function $\Psi$ are complex numbers, or pairs of real numbers representable as complex numbers. Let us further assume that every admissible wave function is a linear combination of monochromatic functions of the special type

$$\Psi = \psi(x,y,z)\,e^{-2\pi iEt/h}, \qquad (5.8)$$

where $i = (-1)^{1/2}$ and $\psi(x,y,z)$ is in general complex.² This represents


¹ E. Schrödinger, Ann. d. Physik (4) 81, 109 (1926).

² If $\psi$ is expressed in the form $\psi = Ae^{i\varphi}$, where $A$ and $\varphi$ are real, we can resolve $\Psi$ into real and imaginary parts by the formula

$$\Psi = A\cos\left[\frac{2\pi Et}{h} - \varphi\right] - iA\sin\left[\frac{2\pi Et}{h} - \varphi\right].$$





a standing wave system if $\psi$ is real and a progressive wave system if $\psi$ has an appropriate complex form. In the case of a monochromatic wave of the type of Eq. (5.8) the exponential time factor in Eq. (5.5) may be canceled out, yielding the equation

$$\nabla^2\psi + \frac{8\pi^2\mu}{h^2}(E - V)\psi = 0. \qquad (5.9)$$

The factor $\psi(x,y,z)$ is sometimes called the amplitude or "space factor" of the wave function. When we have to do with monochromatic wave functions, a knowledge of $\psi$ is equivalent to a knowledge of the complete function $\Psi$, for the two functions have the same absolute value and satisfy the same differential equation. The determination of the $\psi$ functions for any problem involves the evaluation of the corresponding energies, so that the complete functions can be set up if desired. Hence we shall speak of $\psi$ as a "time-free monochromatic wave function," or if no ambiguity is involved, we shall apply to it the simpler term "wave function."

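We can now eliminate $E$ from Eq. (5.5) by means of the relation

$$E\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t},$$

thus obtaining the alternative wave equation

$$\nabla^2\Psi - \frac{8\pi^2\mu}{h^2}V\Psi + \frac{4\pi i\mu}{h}\frac{\partial\Psi}{\partial t} = 0. \qquad (5.10)$$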

This latter equation is much easier to handle than (5.7) as it is of the first order in $t$ and the second order in the space coordinates. We shall call it Schrödinger's second equation for a single particle. Any linear combination of solutions of the family of Eqs. (5.5) having the form (5.8) is a solution of (5.10).
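The last statement can be tested on a superposition of two free-particle waves of different energy. The sketch below is illustrative only and assumes one space dimension, $V = 0$, and units with $h = 2\pi$ and $\mu = 1$, in which the second equation reads $\Psi_{xx} + 2i\Psi_t = 0$ and the time-free equation reads $\Psi_{xx} + 2E\Psi = 0$. The function names are hypothetical.

```python
import cmath


def packet(x, t, components):
    """Sum of monochromatic waves a*exp(i(kx - Et)) with E = k**2/2.
    Units are chosen so that h = 2*pi and mu = 1 (an assumption)."""
    return sum(a * cmath.exp(1j * (k * x - 0.5 * k * k * t)) for a, k in components)


def second_x_derivative(x, t, components, d):
    """Central-difference estimate of the second derivative in x."""
    return (packet(x + d, t, components) - 2.0 * packet(x, t, components)
            + packet(x - d, t, components)) / (d * d)


def residual_general(x, t, components, d=1e-3):
    """Residual of Psi_xx + 2i Psi_t, the V = 0 form of the second equation."""
    psi_t = (packet(x, t + d, components) - packet(x, t - d, components)) / (2.0 * d)
    return second_x_derivative(x, t, components, d) + 2j * psi_t


def residual_time_free(x, t, components, E, d=1e-3):
    """Residual of Psi_xx + 2E Psi, the V = 0 form of the time-free equation."""
    return second_x_derivative(x, t, components, d) + 2.0 * E * packet(x, t, components)


if __name__ == "__main__":
    two_waves = [(1.0, 1.0), (0.5, 2.0)]  # (amplitude, wave number) pairs
    print(abs(residual_general(0.3, 0.7, two_waves)))
    print(abs(residual_time_free(0.3, 0.7, two_waves, E=0.5)))
```

The first residual is zero to within the finite-difference error, while the second is not: no single value of $E$ makes the time-free equation hold for the sum.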

Not all solutions of Eqs. (5.9) and (5.10) are of direct physical interest and one of the important problems to be dealt with in Chap. III is that of defining a suitable class of solutions useful for physical purposes which we shall describe as physically admissible. We assume provisionally that all such physically admissible solutions of Eq. (5.10) can be expressed as linear combinations of solutions of Eq. (5.5) which are of the form specified by (5.8).¹

The justification of the restriction of Eq. (5.8) is that complex waves of the type which it defines are easy to handle mathematically and adequate to the needs of our problem. Our problem is, it will be remembered, merely that of formulating a mathematical description of a type of waves which will describe the facts of classical mechanics in limiting cases where diffraction effects are negligible. There is nothing in the situation which requires $\Psi$ to be either a scalar or any particular kind of a vector. The wave function for sound waves (density or pressure) is a scalar, whereas the waves of the Maxwell electromagnetic theory consist of two three-dimensional real vectors $\vec{\mathcal{E}}$ and $\vec{\mathcal{H}}$ which may be united into a single three-dimensional complex vector. In neither case can one formulate a single second-order partial differential equation like (5.10) which summarizes the properties of either monochromatic or non-monochromatic waves in a dispersive medium. Hence the complex waves here introduced are mathematically simpler than either sound or light waves in a dispersive medium.

¹ Here an integral over a continuum of solutions and the limit of an infinite series of discrete solutions are included within the scope of the phrase "linear combination of solutions." If

Of course our freedom to use these complex waves is dependent on the fact that while the intensity of the waves, measured by $|\Psi|^2$, has direct physical meaning (cf. p. 6), $\Psi$ itself does not. It follows that the complex conjugate of any wave function would serve equally well to describe the same physical situation. We shall indicate the conjugate of any complex number by an asterisk. Thus

$$\Psi^* = \psi^*(x,y,z)\,e^{2\pi iEt/h}. \qquad (5.11)$$

Evidently $\Psi^*$ satisfies the differential equation

$$\nabla^2\Psi^* - \frac{8\pi^2\mu}{h^2}V\Psi^* - \frac{4\pi i\mu}{h}\frac{\partial\Psi^*}{\partial t} = 0. \qquad (5.12)$$

The choice of $\Psi$ rather than $\Psi^*$ as the wave function is purely a matter of convention since the same physical results would be obtained by reversing the choice.¹ We shall make use of both functions in the development of the theory.

As regards the properties of the wave equation (5.10), we may observe here that it is formally similar to the equation for the diffusion of heat since it is of the first order in $t$. Owing to this circumstance the complete wave function in any particular case, if analytic in $t$, is determined by the special form of the wave equation and by the instantaneous form of $\Psi$ at some initial instant, say $t = 0$.² On the other hand,

¹ The opposite convention has been adopted by many writers and was used by Prof. E. L. Hill and the author in their articles "General Principles of Quantum Mechanics," Rev. Mod. Phys. 1, 157 (1929), 2, 1 (1930). The present choice is sanctioned by convenience and more general use.

² To prove the above proposition we must show that if any two solutions of Eq. (5.10), say $\Psi_1$ and $\Psi_2$, have identical values at $t = 0$, their difference is identically zero. Let $\Psi_3$ denote the difference function $\Psi_1 - \Psi_2$. Then $\Psi_3$ is a solution of





the imaginary coefficient of $\partial\Psi/\partial t$ in Eq. (5.10) gives its solutions an undamped wave form quite different from the solutions of the differential equation for thermal diffusion.
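The contrast with thermal diffusion can be seen by stepping both equations forward from the same initial data. The sketch below is an illustration under stated assumptions: one space dimension, $V = 0$, units with $h = 2\pi$ and $\mu = 1$ (so that the wave equation reads $i\,\partial\Psi/\partial t = -\tfrac{1}{2}\partial^2\Psi/\partial x^2$), a grid with $\Psi = 0$ at both ends, and the Crank-Nicolson scheme, which is a numerical choice made here and not a method taken from the text.

```python
import cmath


def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for a tridiagonal system with constant bands."""
    n = len(rhs)
    c_prime = [0j] * n
    d_prime = [0j] * n
    c_prime[0] = off / diag
    d_prime[0] = rhs[0] / diag
    for j in range(1, n):
        denom = diag - off * c_prime[j - 1]
        c_prime[j] = off / denom
        d_prime[j] = (rhs[j] - off * d_prime[j - 1]) / denom
    x = [0j] * n
    x[-1] = d_prime[-1]
    for j in range(n - 2, -1, -1):
        x[j] = d_prime[j] - c_prime[j] * x[j + 1]
    return x


def crank_nicolson_step(psi, c, dx):
    """Solve (1 + c*H) psi_new = (1 - c*H) psi_old with H = -(1/2) d2/dx2.
    c = 1j*dt/2 gives the wave equation, c = dt/2 the diffusion equation."""
    r = c / (2.0 * dx * dx)
    padded = [0.0] + list(psi) + [0.0]  # psi vanishes at both ends
    rhs = [(1.0 - 2.0 * r) * padded[j + 1] + r * (padded[j] + padded[j + 2])
           for j in range(len(psi))]
    return solve_tridiagonal(-r, 1.0 + 2.0 * r, rhs)


def intensity(psi, dx):
    """Grid approximation to the integral of |psi|**2."""
    return sum(abs(v) ** 2 for v in psi) * dx


def intensity_ratios(n=200, dx=0.1, dt=0.01, steps=200):
    """Return final/initial integrated intensity for (wave, diffusion)."""
    x0 = 0.5 * (n + 1) * dx
    start = [cmath.exp(-0.5 * ((j + 1) * dx - x0) ** 2 + 1j * (j + 1) * dx)
             for j in range(n)]
    ratios = []
    for c in (0.5j * dt, 0.5 * dt):
        psi = list(start)
        for _ in range(steps):
            psi = crank_nicolson_step(psi, c, dx)
        ratios.append(intensity(psi, dx) / intensity(start, dx))
    return tuple(ratios)


if __name__ == "__main__":
    print(intensity_ratios())
```

With the imaginary coefficient the integrated intensity stays at its initial value to within rounding error, whereas the real coefficient of the diffusion equation makes it decay steadily.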

*6. THE APPLICATION OF THE RESTRICTED RELATIVITY PRINCIPLE 

Historically the formulation of the Schrödinger wave equations was antedated by de Broglie's application of the restricted relativity principle to the problem of the correlation of waves and free particles.¹ His argument will be reviewed here, since it leads to a very different expression for the phase velocity from that given in Eq. (5.2).

Symmetry demands that a stationary particle be associated with a stationary rather than a progressive wave system. Hence we may postulate the form

$$\Psi = f(x_0,y_0,z_0)\,e^{-2\pi i\nu_0t_0} \qquad (6.1)$$

for the wave function of a particle referred to a system of coordinates $x_0, y_0, z_0$ with respect to which it is at rest. To get the corresponding wave form for a free particle moving in the direction of the $z$ axis with a speed $v$, de Broglie applies the Lorentz transformation

$$x_0 = x, \quad y_0 = y, \quad z_0 = \frac{z - vt}{\sqrt{1 - v^2/c^2}}, \quad t_0 = \frac{t - vz/c^2}{\sqrt{1 - v^2/c^2}},$$

which yields

$$\Psi = f\!\left(x, y, \frac{z - vt}{\sqrt{1 - v^2/c^2}}\right)\exp\!\left[-2\pi i\nu_0\frac{t - vz/c^2}{\sqrt{1 - v^2/c^2}}\right]. \qquad (6.2)$$

This expression may be made to describe an infinite plane wave system or a localized disturbance according to the hypothesis regarding the space factor $f(x_0,y_0,z_0)$. In either case the frequency defined by the phase factor is

$$\nu = \frac{\nu_0}{\sqrt{1 - v^2/c^2}}.$$


(5.10) which vanishes identically in $x, y, z$ at $t = 0$. Thus, by (5.10),

$$\left(\frac{\partial\Psi_3}{\partial t}\right)_{t=0} = 0.$$

Differentiating both sides of (5.10) with respect to $t$, inverting the order of differentiation and setting $t = 0$, we see that $(\partial^2\Psi_3/\partial t^2)_{t=0}$ must also vanish. In the same way we can prove that all the time derivatives of $\Psi_3$ vanish at $t = 0$. Thus all terms in the expansion of $\Psi_3$ in powers of $t$ vanish for all values of $x, y, z$ and hence $\Psi_3$ must be identically equal to zero if it is analytic in $t$ (cf. Sec. 32b).

¹ Loc. cit., footnote 1, p. 3.





If the zero level of energy is fixed in accordance with the usual relativity expression

$$E_r = \frac{\mu_0c^2}{\sqrt{1 - v^2/c^2}}, \qquad (6.3)$$

$E_r$ and $\nu$ transform in the same way so that the fundamental relation (2.1) is unaffected by the change of coordinate systems. This fact was the first great contribution of de Broglie. The wave length for a free particle required by Eq. (6.2) is

$$\lambda = \frac{c^2\sqrt{1 - v^2/c^2}}{v\nu_0} = \frac{c^2h}{E_rv} = \frac{h}{\mu v},$$

in agreement with Eq. (2.2). The phase velocity on the other hand is

$$w = \lambda\nu = \frac{c^2}{v}. \qquad (6.4)$$

The discrepancy between this expression for the phase velocity and the value $v/2$ derived for a free particle from the nonrelativistic point of view [cf. Eq. (5.2)] suggests that absolute phase velocity is without physical significance. As a matter of fact there is no way in which it can be measured experimentally.
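Although the phase velocity depends on where the zero of energy is placed, the group velocity does not. A short numerical sketch, assuming $E_r = h\nu$ with $E_r = \mu_0c^2/\sqrt{1 - v^2/c^2}$ and $\lambda = h/p$ with the relativistic momentum, illustrates both statements; the function names and constant values are assumptions made here.

```python
import math

C = 2.998e10   # speed of light, cm/s
H = 6.626e-27  # Planck's constant, erg s


def wave_number_and_frequency(rest_mass, speed):
    """Return (1/lambda, nu), assuming lambda = h/p and E_r = h*nu."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    momentum = gamma * rest_mass * speed
    energy = gamma * rest_mass * C * C
    return momentum / H, energy / H


def phase_velocity(rest_mass, speed):
    """lambda*nu for the free relativistic particle."""
    sigma, nu = wave_number_and_frequency(rest_mass, speed)
    return nu / sigma


def group_velocity(rest_mass, speed, rel_step=1e-4):
    """d(nu)/d(1/lambda), estimated by varying the particle speed."""
    dv = speed * rel_step
    s_hi, nu_hi = wave_number_and_frequency(rest_mass, speed + dv)
    s_lo, nu_lo = wave_number_and_frequency(rest_mass, speed - dv)
    return (nu_hi - nu_lo) / (s_hi - s_lo)


if __name__ == "__main__":
    m0, v = 9.109e-28, 0.1 * C
    print(phase_velocity(m0, v) * v / C ** 2)  # close to 1
    print(group_velocity(m0, v) / v)           # close to 1
```

The phase velocity comes out as $c^2/v$, greater than the speed of light, while the group velocity reproduces the particle speed.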

If the particle is not free, but moves under the influence of a force field with potential energy $V(x,y,z)$, we can still take into account the relativistic variation of mass with speed. The principle of least action in the form of Eq. (3.4) is still valid (cf. Appendix A), although the expression for the momentum in terms of the potential and total energies has the more complicated form

$$p(E_r,x,y,z) = \frac{1}{c}\sqrt{(E_r - V)^2 - \mu_0^2c^4}. \qquad (6.5)$$

A comparison of this principle with Fermat's principle shows that the mechanical orbits and the rays of the wave problem agree if $C/\lambda = p$, where $C$ is a constant as in Eq. (3.5). As before we find that

$$\frac{\partial p}{\partial E_r} = \frac{\mu}{p} = \frac{1}{v}. \qquad (6.6)$$

It follows from Eq. (4.1) that the group velocity of the waves is equal to the speed of the particle if we identify $C$ with $dE_r/d\nu$ or $h$.

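The monochromatic wave equation corresponding to the above expression for $p$ is

$$\nabla^2\psi + \frac{4\pi^2}{h^2c^2}\left[(E_r - V)^2 - \mu_0^2c^4\right]\psi = 0. \qquad (6.7)$$

Introducing the ordinary energy $E = E_r - \mu_0c^2$ we readily reduce (6.7) to the form¹

$$\nabla^2\psi + \frac{8\pi^2\mu_0}{h^2}(E - V)\psi = -\frac{4\pi^2}{h^2c^2}(E - V)^2\psi. \qquad (6.8)$$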

The right-hand member here appears as a correction term to (5.5) which is usually small. We shall use this equation for the relativistic treatment of hydrogenic atoms in Chap. XIII. It is to be observed, however, that the assumption that the forces are derivable from a potential energy function is not valid in relativistic dynamics except in certain special cases and then only for a single frame of reference.² Hence a thoroughgoing relativistic wave equation cannot be formulated on the basis of the action function given by (6.5). Moreover, the fact that the energy $E$ enters nonlinearly into (6.8) makes a difficulty which ultimately spoils the possibility of basing a satisfactory quantum mechanics on (6.8). For this reason we shall not take the space here to discuss the extension of the wave equation (6.7) to the non-monochromatic case.
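The reduction of (6.7) to (6.8) is a short piece of algebra. The sketch below assumes that (6.7) is the time-free equation $\nabla^2\psi + (4\pi^2p^2/h^2)\psi = 0$ with $p$ taken from (6.5); that reading is an inference from the surrounding text.

```latex
\begin{align*}
(E_r - V)^2 - \mu_0^2 c^4
  &= (E - V + \mu_0 c^2)^2 - \mu_0^2 c^4 \\
  &= (E - V)^2 + 2\mu_0 c^2 (E - V), \\[4pt]
\frac{4\pi^2 p^2}{h^2}
  &= \frac{4\pi^2}{h^2 c^2}\left[(E - V)^2 + 2\mu_0 c^2 (E - V)\right]
   = \frac{8\pi^2 \mu_0}{h^2}(E - V) + \frac{4\pi^2}{h^2 c^2}(E - V)^2 .
\end{align*}
```

The first term has the same form as the coefficient in (5.5) with the rest mass in place of $\mu$; carrying the second term to the right-hand side gives the correction term, which is quadratic in $E - V$ and of relative order $(E - V)/2\mu_0c^2$.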

7. THE WAVE EQUATION FOR A SYSTEM OF MANY PARTICLES 

7a. Formulation of Equation. Equations (5.5) and (5.10) can be generalized without difficulty to include the case of a system of $n$ particles moving under the influence of conservative forces. The generalization, however, involves one important break with optical analogy. In optics a wave function spread out in ordinary three-dimensional space can describe the statistical behavior of any number of coexistent photons. The photons apparently exert no influence upon each other so that the form of an interference pattern is independent of the intensity of the light used. We infer that the same wave function may be used to describe the behavior of one photon or ten thousand. In the case of matter corpuscles, however, we must consider the "forces" which they exert on each other. In a system of $n$ particles the motion of each depends on the coordinates of all,³ since the potential energy $V$ is a function of the coordinates of all.

Thus the wave function $\Psi$ which is to describe the behavior of a system of $n$ particles must depend on the $3n$ coordinates of the system and the time. We may say, if we like, that it is spread out over a $3n$-dimensional "coordinate" or "configuration" space in which each point represents a possible configuration of the system as a whole. But the use of such geometrical language is not essential and means

¹ Cf. O. Klein, Zeits. f. Physik 37, 895 (1926); W. Gordon, Zeits. f. Physik 40, 117 (1926).

² Cf. W. Pauli, Jr., "Relativitätstheorie," Encyklopädie der Mathematischen Wissenschaften, XIX, p. 678.

³ Classically, and hence also in the quantum mechanics.





merely that, since $\Psi$ depends on $3n$ independent variables, it could be laid out as a "point" function only in a space having the corresponding number of dimensions. In practice the wave function for any particular case is always derived and applied by purely analytical methods independent of the concept of configuration space.

The generalization of Eqs. (5.5) and (5.10) for a system of $n$ particles can be derived with the aid of the appropriate form of the principle of least action and a suitable extension of Fermat's principle to $3n$-dimensional waves.¹ Inspection of the three-dimensional equations suffices, however, to suggest the correct generalization. Let Eq. (5.5) be written first in the form

$$\frac{1}{\mu}\frac{\partial^2\Psi}{\partial x^2} + \frac{1}{\mu}\frac{\partial^2\Psi}{\partial y^2} + \frac{1}{\mu}\frac{\partial^2\Psi}{\partial z^2} + \frac{8\pi^2}{h^2}[E - V(x,y,z)]\Psi = 0. \qquad (7.1)$$


It will be observed that each coordinate enters the equation through the potential energy function $V$ and also through the corresponding operator $\frac{1}{\mu}\frac{\partial^2}{\partial x^2}$, $\frac{1}{\mu}\frac{\partial^2}{\partial y^2}$, or $\frac{1}{\mu}\frac{\partial^2}{\partial z^2}$ as the case may be. We assume that the classical potential energy function for the problem in hand is known and use it also in the wave equation. The obvious procedure for generalizing Eq. (7.1) is then to add a corresponding differential operator for each added coordinate. Let the ordinary three-dimensional Cartesian coordinates of the particles be labeled as a single series $x_1, x_2, \cdots, x_{3n}$, and let the corresponding masses be $\mu_1, \mu_2, \cdots, \mu_{3n}$, where, of course, the three masses of any one particle are the same. Then the expanded equations, analogous to (5.5) and (5.10), are

$$\sum_{k=1}^{3n}\frac{1}{\mu_k}\frac{\partial^2\Psi}{\partial x_k^2} + \frac{8\pi^2}{h^2}(E - V)\Psi = 0, \qquad (7.2)$$

$$\sum_{k=1}^{3n}\frac{1}{\mu_k}\frac{\partial^2\Psi}{\partial x_k^2} - \frac{8\pi^2}{h^2}V\Psi + \frac{4\pi i}{h}\frac{\partial\Psi}{\partial t} = 0. \qquad (7.3)$$


As noted above, these equations may be justified in the limiting case where diffraction effects are negligible by showing that they are in harmony with the principle of least action. An alternative procedure is to use Eq. (7.3) to derive Hamilton's canonical classical equations of motion for sharply defined wave packets (cf. Chap. VIII, Sec. 39c). The final, general justification of Eqs. (7.2) and (7.3) must obviously come through the agreement between results derived from them and the facts of experiment in the domain of atomic and molecular physics.

¹ Cf. E. C. Kemble, Rev. Mod. Phys. 1, 166 (1929).






Implicit in the above extension of the wave equations is the assumption that the total energy of the system of particles and the frequency of the associated wave system are related by the rule (2.1),

$$E = h\nu.$$

By means of this rule we can pass from Eq. (7.2) to Eq. (7.3) or from Eq. (7.3) to Eq. (7.2). Thus, starting from Eq. (7.3), let us seek a monochromatic solution of the form

$$\Psi = \psi(x_1, x_2, \cdots, x_{3n})\,e^{-2\pi i\nu t}.$$

Substitution of this expression yields

$$\sum_{k=1}^{3n}\frac{1}{\mu_k}\frac{\partial^2\psi}{\partial x_k^2} + \frac{8\pi^2}{h^2}(h\nu - V)\psi = 0.$$

With the aid of (2.1) this reduces to Eq. (7.2).
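The passage between the two equations can be verified on a simple case with two coordinates and unequal masses. The sketch below is illustrative only: it assumes a field-free box $0 \le x_1, x_2 \le 1$, units with $h = 2\pi$ (so that the general equation with $V = 0$ reads $\sum_k \mu_k^{-1}\,\partial^2\Psi/\partial x_k^2 + 2i\,\partial\Psi/\partial t = 0$), and a product of sines as the space factor. The function names are hypothetical.

```python
import cmath
import math


def wave(x1, x2, t, n1, n2, mu1, mu2):
    """Monochromatic wave psi(x1, x2) * exp(-i E t) for a field-free box,
    in units with h = 2*pi (an assumption made for this sketch)."""
    energy = 0.5 * math.pi ** 2 * (n1 ** 2 / mu1 + n2 ** 2 / mu2)
    space = math.sin(n1 * math.pi * x1) * math.sin(n2 * math.pi * x2)
    return space * cmath.exp(-1j * energy * t)


def residual(x1, x2, t, n1, n2, mu1, mu2, d=1e-4):
    """Finite-difference residual of the general equation with V = 0:
    sum over k of (1/mu_k) d2Psi/dx_k2, plus 2i dPsi/dt."""
    def f(a, b, s):
        return wave(a, b, s, n1, n2, mu1, mu2)

    d2_x1 = (f(x1 + d, x2, t) - 2.0 * f(x1, x2, t) + f(x1 - d, x2, t)) / d ** 2
    d2_x2 = (f(x1, x2 + d, t) - 2.0 * f(x1, x2, t) + f(x1, x2 - d, t)) / d ** 2
    d_t = (f(x1, x2, t + d) - f(x1, x2, t - d)) / (2.0 * d)
    return d2_x1 / mu1 + d2_x2 / mu2 + 2j * d_t


if __name__ == "__main__":
    print(abs(residual(0.3, 0.6, 0.2, n1=1, n2=2, mu1=1.0, mu2=3.0)))
```

The residual vanishes to within the finite-difference error because the time derivative supplies exactly the term $2E\Psi$ that the time-free equation requires.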


7b. Relation of Schrödinger Equation to the Classical Hamiltonian Function. Equation (7.2) is formally related to the classical Hamiltonian function for the system under consideration and can be deduced from it by the following rule of thumb. First set up the energy equation using the classical Hamiltonian function in Cartesian coordinates:¹

$$H(p,x) \equiv \sum_{k=1}^{3n}\frac{p_k^2}{2\mu_k} + V(x_1, \cdots, x_{3n}) = E. \qquad (7.4)$$


¹ The condition for the existence of a classical Hamiltonian function is that the equations of motion of the system under consideration are reducible to the Lagrangian form

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_k}\right) - \frac{\partial L}{\partial q_k} = 0, \qquad k = 1, 2, \cdots, f$$

[cf. Appendix A] where $L$ is a suitably defined function of the generalized coordinates $q_1, q_2, \cdots, q_f$, their velocities, and the time. If the kinetic energy $T$ is a homogeneous quadratic function of the velocities as in the Newtonian nonrelativistic dynamics, and if the forces acting on the system are derivable from a potential function $V(q_1, \cdots, q_f, t)$ which does not depend on the velocities, such a reduction of the equations of motion is possible with

$$L = T - V.$$

In more general cases, e.g., when the relativistic variation of mass is taken into account, or when the particles are acted on by forces of magnetic origin depending directly on the velocities, it is necessary to invent an appropriate Lagrangian function $L$.

When the equations have been thrown into Lagrangian form, the momenta conjugate to the coordinates $q_1, \cdots, q_f$ are defined by the equations

$$p_k = \frac{\partial L}{\partial\dot{q}_k}, \qquad k = 1, 2, \cdots, f,$$

and the Hamiltonian function $H(p,q,t)$ is derived from the function $\sum_k p_k\dot{q}_k - L(q,\dot{q},t)$ by eliminating the velocities with the aid of the corresponding momenta.

In the Newtonian theory of a conservative system where L ~ T - F, the momen- 





Second, convert each side of Eq. (7.4) into an operator¹ by the substitution

$$p_k \rightarrow \frac{h}{2\pi i}\frac{\partial}{\partial x_k}, \qquad k = 1, 2, \cdots, 3n,$$

$V \rightarrow$ operation of multiplying by $V$,

$E \rightarrow$ operation of multiplying by $E$.

Finally, let each member of the operator equation act on $\Psi(x_1, \cdots, x_{3n}, t)$.

The resulting differential equation is equivalent to (7.2) and is frequently expressed in the operator form

$$H\!\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_k}, x_k\right)\Psi = E\Psi, \qquad (7.5)$$

where $H(\partial/\partial x_k, x_k)$, or $H(\partial/\partial x, x)$, stands for the operator obtained from the classical Hamiltonian by the above substitution. In case no ambiguity is involved, this operator, or any modification of it which plays the same role in the theory, is designated by the simple symbol $H$.

To set up Eq. (7.3) one proceeds as above, except that one replaces $E$ by the operator $-\frac{h}{2\pi i}\frac{\partial}{\partial t}$.
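The rule of thumb can be tried out on a single coordinate. The sketch below is a hedged illustration: it assumes units with $h/2\pi = 1$, the classical Hamiltonian $p^2/2\mu + \tfrac{1}{2}x^2$ with $\mu = 1$, and central differences in place of exact derivatives. The function names are hypothetical, and the trial function $e^{-x^2/2}$ is chosen here because it is reproduced with the multiplier $E = \tfrac{1}{2}$.

```python
import math

HBAR = 1.0  # h/(2*pi) in the illustrative units used below


def momentum_operator(f, d=1e-3):
    """Return the function (h/(2 pi i)) df/dx, using a central difference."""
    def p_f(x):
        return (HBAR / 1j) * (f(x + d) - f(x - d)) / (2.0 * d)
    return p_f


def hamiltonian_operator(f, mu, potential):
    """Return H f, where H comes from p**2/(2*mu) + V by the substitution
    p -> (h/(2 pi i)) d/dx and V -> multiplication by V."""
    p_p_f = momentum_operator(momentum_operator(f))

    def h_f(x):
        return p_p_f(x) / (2.0 * mu) + potential(x) * f(x)
    return h_f


if __name__ == "__main__":
    def trial(x):
        return math.exp(-0.5 * x * x)

    def well(x):
        return 0.5 * x * x

    h_trial = hamiltonian_operator(trial, 1.0, well)
    for x in (0.0, 0.4, 1.3):
        print(x, h_trial(x) / trial(x))  # each ratio is close to 0.5
```

Applying the momentum operator twice yields $-\hbar^2\,d^2/dx^2$, so the operator built this way is the one-coordinate form of the left-hand member of (7.2) up to a constant factor.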

*7c. The Schrödinger Equation and the Hamilton-Jacobi Equation. These substitutions form an obvious parallel to those which are made in setting up the Hamilton-Jacobi partial differential equation of classical dynamics.² The latter equation has two forms similar to (7.2) and (7.3) respectively. In case the Hamiltonian function $H$ does not depend explicitly on the time, it remains constant during any natural motion of the system and is identified with the energy of the system. The first form of the Hamilton-Jacobi equation is obtained from the energy equation $H(p,q) = E$ by means of the substitution

$$p_k = \frac{\partial S}{\partial q_k}, \qquad k = 1, 2, \cdots, f.$$


turn Pk reduces to dT/dqkt and the Hamiltonian function is equal to the total energy E. 
In the case of a system of $n$ particles with Cartesian coordinates, $p_k$ becomes $\mu_k\dot{x}_k$ and the Hamiltonian function reduces to the simple form (7.4).

Whatever the special forms of $L$ and $H$ may be, the $f$ second-order Lagrangian equations given above are equivalent to the first-order canonical equations of Hamilton

$$\frac{dq_k}{dt} = \frac{\partial H}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k}.$$

¹ An operator is a rule for the transformation of one function into another. The transformed function, or "transform," can have the same arguments as the original function, or a different set.

² Cf. any standard text in advanced analytical dynamics.





Any system of generalized coordinates $q_1, q_2, \cdots, q_f$ is permissible. A solution of the resulting inhomogeneous first-order partial differential equation

$$H\!\left(\frac{\partial S}{\partial q_1}, \cdots, \frac{\partial S}{\partial q_f}, q_1, \cdots, q_f\right) = E \qquad (7.6a)$$

is sometimes called an "action function," although it is not to be identified with the action integral of the principle of least action (cf. footnote, p. 44).

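The second form of the Hamilton-Jacobi equation is applicable even when $H$ involves $t$ explicitly and the energy is not conserved. It is obtained from $H(p,q,t)$ by replacing $p_k$ by $\partial A/\partial q_k$ for all values of $k$ and equating the resulting expression to $-\partial A/\partial t$. Thus

$$H\!\left(\frac{\partial A}{\partial q_1}, \cdots, \frac{\partial A}{\partial q_f}, q_1, \cdots, q_f, t\right) = -\frac{\partial A}{\partial t}. \qquad (7.6b)$$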


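The parallelism between this classical equation and the wave equation

$$H\!\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_1}, \cdots, \frac{h}{2\pi i}\frac{\partial}{\partial x_f}, x_1, \cdots, x_f, t\right)\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t} \qquad (7.3a)$$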


is striking. We have not followed Schrödinger¹ in deriving the general wave equation (7.3a) directly from (7.6b) but the reader will find a discussion of the intimate relation between these equations in Chap. II, Secs. 11, 12.² The apparently formal parallelism between the general wave equation and the second form of the Hamilton-Jacobi equation actually gives a basis for proving the asymptotic agreement between the classical mechanics and quantum mechanics in cases which lie outside the range of the principle of least action, e.g., in the case of nonconservative systems where the forces depend on the velocities or are explicit functions of the time. Furthermore it resolves the apparent ambiguity in the sign of the exponent in Eq. (5.8). If this sign is negative, we must use the substitution $E \rightarrow -\frac{h}{2\pi i}\frac{\partial}{\partial t}$ in setting up (7.3) or (7.3a), whereas a positive sign would correspond to the substitution $E \rightarrow +\frac{h}{2\pi i}\frac{\partial}{\partial t}$. If the Hamiltonian operator is real, the two choices are equivalent as indicated by Eq. (5.12). However, in the more general case of a complex Hamiltonian operator, such as the one to be introduced in Eq. (7.8), they are not equivalent. In order to maintain the parallelism between (7.3a) and (7.6b) in such cases it is necessary to use the substitution $E \rightarrow -\frac{h}{2\pi i}\frac{\partial}{\partial t}$ along with $p_k \rightarrow +\frac{h}{2\pi i}\frac{\partial}{\partial q_k}$, or to reverse signs throughout.¹

¹ Cf. footnote 3, p. 3.

² Cf. also G. D. Birkhoff, Proc. Nat. Acad. Sci. 19, 339 (1933).
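The parallelism can be made concrete for a single particle with the Hamiltonian $p^2/2\mu + V$. The sketch below assumes the trial form $\Psi = e^{2\pi iA/h}$, a choice made here for illustration only, and inserts it into the general wave equation with $E$ replaced by $-\frac{h}{2\pi i}\frac{\partial}{\partial t}$:

```latex
\begin{align*}
\frac{h}{2\pi i}\frac{\partial \Psi}{\partial x}
   &= \frac{\partial A}{\partial x}\,\Psi, \qquad
\left(\frac{h}{2\pi i}\right)^{2}\nabla^{2}\Psi
   = \left[(\nabla A)^{2} + \frac{h}{2\pi i}\nabla^{2}A\right]\Psi, \qquad
-\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}
   = -\frac{\partial A}{\partial t}\,\Psi, \\[4pt]
\frac{1}{2\mu}(\nabla A)^{2} + V + \frac{h}{4\pi i\mu}\nabla^{2}A
   &= -\frac{\partial A}{\partial t}.
\end{align*}
```

If the term containing $h$ is neglected, what remains is the second form of the Hamilton-Jacobi equation for this Hamiltonian. Had the opposite sign been used in the operator substituted for $E$, the right-hand member would have appeared with the wrong sign.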

From one point of view the sign of the operator substituted for $E$ is somewhat puzzling on first examination. It is well known that the function $S$ of Eq. (7.6a) is the generating function of a contact transformation which replaces the variables $q_1, \cdots, q_f$ by a new set one of which is the time, or differs from it by a constant. The momentum conjugate to $t$ in this new set of coordinates is $E$. Since we substitute $+\frac{h}{2\pi i}\frac{\partial}{\partial x_k}$ for the momentum conjugate to $x_k$, one might argue that to be consistent we ought to substitute $+\frac{h}{2\pi i}\frac{\partial}{\partial t}$ for $E$. The argument is fallacious, however, for the classical Hamiltonian from which we form the Hamiltonian operator in Eq. (7.3a) has not been subjected to the above-mentioned transformation. $E$ is not conjugate to $t$ when the independent variables are $q_1, \cdots, q_f$ and we cannot make substitutions corresponding to two different sets of independent variables in the same equation.

It is possible, however, to get a satisfactory classical analogue for the substitutions used in setting up Eq. (7.3a) without using the Hamilton-Jacobi equation. To do so one makes use of a classical scheme for treating the time on the same formal basis as the spatial coordinates.² In this scheme a parameter $\tau$ is introduced as the independent variable and the number of independent coordinates for an $f$-dimensional system is stepped up to $f + 1$ by adding the time to the ordinary coordinates $q_1, \cdots, q_f$. $2f + 2$ Hamiltonian equations are then set up with $-E$ playing the role of momentum conjugate to the coordinate $t$.

*7d. The Wave Equation for a System of Charged Particles in a Classical External Electromagnetic Field. For use in connection with the study of the Zeeman and Stark effects and for a discussion of the absorption of radiation we shall need a wave equation applicable to a system of charged particles in a classical electromagnetic field. Although the derivation of this equation is somewhat technical for an introductory chapter, we insert it here to avoid repetition in later chapters.

¹ It is much easier to transform the Hamiltonian operator and the first-order Hamilton-Jacobi equation from one coordinate system to another than to make the corresponding transformation of the second-order wave equation (7.3a). Hence there is an obvious temptation to set up the wave equation in a generalized coordinate system by transforming the classical Hamiltonian and subsequently making the substitution

$$p_k \rightarrow \frac{h}{2\pi i}\frac{\partial}{\partial q_k}, \qquad E \rightarrow -\frac{h}{2\pi i}\frac{\partial}{\partial t}.$$

This is not permissible in general, however, and it is necessary to set up the wave equation in Cartesian coordinates first and apply the transformation afterward, or to make use of some method definitely proved to be equivalent (cf. Chap. VII, Sec. 35b).

² See the article on Hamilton-Jacobi theory by L. Nordheim and E. Fues, in Geiger and Scheel's Handbuch der Physik, Band V, Kap. 3, Ziff. 4, Berlin, 1927.

As we have introduced the subject of wave mechanics by a consideration of the corpuscular theory of light, i.e., with a preliminary study of the quantum properties of the electromagnetic field, the reader will be inclined to raise his eyebrows at the attempt to combine a quantum theory of the atom with a classical picture of an interacting electromagnetic field. Our excuse for the construction of such a hybrid theory lies partly in the extreme difficulty of formulating a satisfactory thoroughgoing quantum theory of the interaction between matter and radiation and partly in the observation that in the limiting case of very long wave lengths (static or quasi-static fields) the corpuscular properties of the electromagnetic field recede into the background while the classical properties dominate. Hence we can reasonably hope that such a classical treatment of the field will be in asymptotic agreement with experiment as the wave lengths under consideration become very large. As a matter of fact the absorption formulas which the theory yields have proved satisfactory over a very wide range of the spectrum.

Following the method sketched above, we shall begin by constructing the appropriate classical Hamiltonian function in Cartesian coordinates, converting it into an operator as before and using the operator to form an equation of the form (7.3a). If the external field varies with the time, the Hamiltonian will involve $t$ explicitly and we have to do with a case in which the energy is not conserved.

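Consider a system of n charged particles moving in an external
classical electromagnetic field with the scalar potential $\Phi(x, y, z, t)$
and the vector potential $\vec{\mathcal{A}}(x, y, z, t)$. Let
$\mathcal{A}_x^{(j)}, \mathcal{A}_y^{(j)}, \mathcal{A}_z^{(j)}$ denote the
components of the vector potential at the point $x_j, y_j, z_j$ where the
jth particle is located. Let $\mu_j$ and $e_j$ denote respectively the mass
and the algebraic value of the charge of that particle. The classical
Hamiltonian function then takes the form¹

$$H = \sum_{j=1}^{n} \frac{1}{2\mu_j}\left[\left(p_{x_j} - \frac{e_j}{c}\mathcal{A}_x^{(j)}\right)^2 + \left(p_{y_j} - \frac{e_j}{c}\mathcal{A}_y^{(j)}\right)^2 + \left(p_{z_j} - \frac{e_j}{c}\mathcal{A}_z^{(j)}\right)^2\right] + V(x_1, \ldots, z_n) + \sum_{j=1}^{n} e_j \Phi(x_j, y_j, z_j, t).$$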

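Here V denotes as usual the internal potential energy of the system which,
in most cases, is computed by the electrostatic method. $\Phi$ includes only
the external part of the total scalar potential. Making the substitution
$p_{x_j} \to \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x_j}$, etc., we obtain
the desired Hamiltonian operator

$$H = \sum_{j=1}^{n} \frac{1}{2\mu_j}\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_j} - \frac{e_j}{c}\mathcal{A}_x^{(j)}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial y_j} - \frac{e_j}{c}\mathcal{A}_y^{(j)}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial z_j} - \frac{e_j}{c}\mathcal{A}_z^{(j)}\right)^2\right] + V + \sum_{j=1}^{n} e_j \Phi^{(j)}, \qquad (7\cdot8)$$

where $\Phi^{(j)}$ stands for $\Phi(x_j, y_j, z_j, t)$.

¹ Cf. Van Vleck, The Theory of Electric and Magnetic Susceptibilities,
Chap. I, p. 7, Oxford, 1932; J. Frenkel, Lehrbuch der Elektrodynamik,
Vol. I, pp. 330-331, Berlin, 1926.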


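Inserting this expression into (7·3a), we deduce the generalized
Schrödinger equation

$$\left\{\sum_{j=1}^{n} \frac{1}{2\mu_j}\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_j} - \frac{e_j}{c}\mathcal{A}_x^{(j)}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial y_j} - \frac{e_j}{c}\mathcal{A}_y^{(j)}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial z_j} - \frac{e_j}{c}\mathcal{A}_z^{(j)}\right)^2\right] + V + \sum_{j=1}^{n} e_j\Phi^{(j)}\right\}\Psi = -\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}. \qquad (7\cdot9)$$

If we make the customary assumption that $\vec{\mathcal{A}}$ and $\Phi$
conform to the special condition

$$\operatorname{div}\vec{\mathcal{A}} + \frac{1}{c}\frac{\partial \Phi}{\partial t} = 0, \qquad (7\cdot10)$$

Eq. (7·9) takes the form

$$\sum_{j=1}^{n}\left[-\frac{h^2}{8\pi^2\mu_j}\nabla_j^2\Psi - \frac{h e_j}{2\pi i \mu_j c}\left(\vec{\mathcal{A}}^{(j)}\cdot\nabla_j\right)\Psi\right] + V'\Psi = -\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}. \qquad (7\cdot11)$$

Here

$$V' = V + \sum_{j=1}^{n}\left[e_j\Phi^{(j)} + \frac{h e_j}{4\pi i \mu_j c^2}\frac{\partial \Phi^{(j)}}{\partial t} + \frac{e_j^2}{2\mu_j c^2}\left|\vec{\mathcal{A}}^{(j)}\right|^2\right],$$

$$\left(\vec{\mathcal{A}}^{(j)}\cdot\nabla_j\right) \equiv \mathcal{A}_x(x_j, y_j, z_j)\frac{\partial}{\partial x_j} + \mathcal{A}_y(x_j, y_j, z_j)\frac{\partial}{\partial y_j} + \mathcal{A}_z(x_j, y_j, z_j)\frac{\partial}{\partial z_j},$$

and $\Phi^{(j)}$ stands for $\Phi(x_j, y_j, z_j, t)$.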

In the case of a system of plane waves or a pure radiation field
resolvable into a superposition of plane waves, it is always possible to
choose a vector potential $\vec{\mathcal{A}}$ with zero divergence and to
set $\Phi$ equal to zero. The term
$\sum_j \dfrac{e_j^2}{2\mu_j c^2}\left|\vec{\mathcal{A}}^{(j)}\right|^2$
is usually so small in practice that it can be safely neglected. Thus the
wave equation for an atomic system interacting with a pure radiation field
reduces to

$$\sum_{j=1}^{n}\left[-\frac{h^2}{8\pi^2\mu_j}\nabla_j^2\Psi - \frac{h e_j}{2\pi i \mu_j c}\left(\vec{\mathcal{A}}^{(j)}\cdot\nabla_j\right)\Psi\right] + V\Psi = -\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}. \qquad (7\cdot12)$$


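In consequence of the basic relations

$$\vec{\mathcal{H}} = \operatorname{curl}\vec{\mathcal{A}}, \qquad \vec{\mathcal{E}} = -\operatorname{grad}\Phi - \frac{1}{c}\frac{\partial \vec{\mathcal{A}}}{\partial t},$$

which give the magnetic and electric vectors $\vec{\mathcal{H}}$ and
$\vec{\mathcal{E}}$ in terms of the potentials $\vec{\mathcal{A}}$ and
$\Phi$, the field vectors are invariant with respect to a substitution of
the form

$$\vec{\mathcal{A}} \to \vec{\mathcal{A}} + \operatorname{grad} U(x, y, z, t), \qquad \Phi \to \Phi - \frac{1}{c}\frac{\partial U}{\partial t}. \qquad (7\cdot13)$$

Hence the physical results obtained from the wave equation (7·9) should be
unchanged by the same substitution. This kind of invariance is called
gauge invariance. As a matter of fact, if we apply the transformation
(7·13) to the potentials and write

$$\Psi = \Psi_0(x_1, \ldots, z_n, t)\,\exp\left[\frac{2\pi i}{hc}\sum_{j=1}^{n} e_j U(x_j, y_j, z_j, t)\right],$$

where $\Psi_0$ is a solution of (7·9), we find that $\Psi$ is a solution of
the transformed equation.¹ The phase factor
$\exp\left[\frac{2\pi i}{hc}\sum_j e_j U(x_j, y_j, z_j, t)\right]$ does not
affect the value of $\Psi\Psi^*$ or the vector mass current density
$\vec{I}$ whose form for the three-dimensional case is worked out in
Sec. 8. We conclude that any legitimate alteration of the potentials
produces a modification of the phase of the wave function which does not
alter its physical implications. It will be observed that if the
specialized wave equation (7·11) is used, we can make the transformation
(7·13) only if the condition (7·10) is preserved, i.e., only if

$$\nabla^2 U - \frac{1}{c^2}\frac{\partial^2 U}{\partial t^2} = 0.$$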

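¹ Cf. V. Fock, Zeits. f. Physik 39, 226 (1926); W. Pauli, Jr., "Die
allgemeinen Prinzipien der Wellenmechanik," Handbuch der Physik, XXIV,
Part 1, 2d ed., pp. 110-111, Berlin, 1933. The point is easily verified
by noting that, with $U_j \equiv U(x_j, y_j, z_j, t)$,

$$\left[\frac{h}{2\pi i}\frac{\partial}{\partial x_j} - \frac{e_j}{c}\left(\mathcal{A}_x^{(j)} + \frac{\partial U_j}{\partial x_j}\right)\right]\Psi = \exp\left[\frac{2\pi i}{hc}\sum_k e_k U_k\right]\left[\frac{h}{2\pi i}\frac{\partial}{\partial x_j} - \frac{e_j}{c}\mathcal{A}_x^{(j)}\right]\Psi_0,$$

$$\left[\frac{h}{2\pi i}\frac{\partial}{\partial t} + \sum_j e_j\left(\Phi^{(j)} - \frac{1}{c}\frac{\partial U_j}{\partial t}\right)\right]\Psi = \exp\left[\frac{2\pi i}{hc}\sum_k e_k U_k\right]\left[\frac{h}{2\pi i}\frac{\partial}{\partial t} + \sum_j e_j\Phi^{(j)}\right]\Psi_0.$$

8. THE PHYSICAL INTERPRETATION OF THE WAVE FUNCTION AND THE
NORMALIZATION CONDITION

8a. Probability and Quadratic Integrability. In concluding this
chapter we return to the question of the physical significance of the
$\Psi$ waves. In the case of three-dimensional waves we have already laid
down the postulate that the square of the absolute value of $\Psi$ at any
space-time point x, y, z, t is to measure the probability that the
associated particle is in the neighborhood of the point x, y, z at the
time t. Let dF denote the probability that the particle is in the volume
element dx dy dz. To make the above hypothesis more explicit we assume that

$$dF = K|\Psi|^2\,dx\,dy\,dz = K\Psi\Psi^*\,dx\,dy\,dz, \qquad (8\cdot1)$$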

where K is a constant or a function of the time. Integration over all
space yields the probability that the particle is somewhere, which must
be unity. Hence,

$$K\iiint_{\infty}\Psi\Psi^*\,dx\,dy\,dz = 1.$$

If the integral $\iiint_{\infty}\Psi\Psi^*\,dx\,dy\,dz$ converges to a
finite value it is equal to the reciprocal of K. $\Psi$ is then said to be
quadratically integrable. Many solutions of the wave equation do not
satisfy this condition and for them the hypothesis (8·1) is inapplicable
since K must vanish. Solutions of the wave equation which are not
quadratically integrable are often of great mathematical and physical
interest, but they play a minor role in the theory as a whole.
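
As a minimal numerical sketch of the role of the constant K, the following
Python fragment normalizes a trial function. It is an illustration only and
rests on assumptions that are not in the text: the problem is reduced to
one dimension, the trial function `psi` is a hypothetical Gaussian, and the
finite range [-12, 12] stands in for "all space".

```python
import math

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule for a real function f on [a, b]."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * step)
    return total * step

# Hypothetical unnormalized one-dimensional wave function (illustrative only).
def psi(x):
    return 3.0 * math.exp(-x * x / 2.0)

# Integral of |psi|^2; it converges, so psi is quadratically integrable.
norm_integral = integrate(lambda x: psi(x) ** 2, -12.0, 12.0)

# K of Eq. (8.1) is the reciprocal of that integral.
K = 1.0 / norm_integral

def psi_normalized(x):
    """The same function rescaled so that its squared modulus integrates to 1."""
    return math.sqrt(K) * psi(x)

check = integrate(lambda x: psi_normalized(x) ** 2, -12.0, 12.0)
```

For a function that does not fall off at infinity, such as a constant, the
same integral grows without bound as the range is widened, which is the
numerical counterpart of the statement that K must vanish.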

These solutions never correspond exactly to actual experimental
physical situations. Consider, for example, the case in which $\Psi$ is not
quadratically integrable because it does not vanish rapidly enough at
infinity. Then an associated particle is sure to lie outside any sphere
which is drawn about the origin of the coordinate system. But in
practice the apparatus and objects studied in a physical experiment
must lie in a bounded portion of space, so that such a function does not
fully represent a practical experimental condition. $\Psi$ may also fail of
quadratic integrability because it becomes too rapidly infinite at some
finite point, say P. This occurs only when the point under consideration
is a center of force such as an atomic nucleus and the $\Psi$ function would
then represent a situation in which the particle is sure to lie inside any
sphere, however small, which is drawn about P as a center. In other
words the particle has condensed on P, and either it does not exist as a
separate entity or we have to do with a problem in nuclear physics with
which our wave equation is not designed to deal.

Quadratically integrable wave functions can usually be normalized in
such a way as to eliminate the constant K of Eq. (8·1). If $\Psi\Psi^*$
integrates to a constant value 1/K it is only necessary to form the new
wave function

$$\Psi_1 = K^{1/2}\,\Psi,$$

which will also be a solution of the wave equation. Then $\Psi_1$
represents the same physical situation as $\Psi$ and in addition satisfies
the normalization condition

$$\iiint_{\infty}\Psi_1\Psi_1^*\,dx\,dy\,dz = 1. \qquad (8\cdot2)$$

The constancy of $\iiint_{\infty}\Psi\Psi^*\,dx\,dy\,dz$ in time is an
obvious corollary of the existence of this integral in the case of
monochromatic wave functions. In the more general case it can be proved
with the aid of suitable mild restrictions on the behavior of $\Psi$ at
infinity.

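8b. Normalization and Mass Current Density. Consider the integral
of $\Psi\Psi^*$ over a volume G bounded by a large sphere A of radius R and
by small spheres $S_1, S_2, \ldots$ excluding the points at which the
differential equation breaks down due to the fact that V is infinite. (We
assume that V is continuous except for isolated poles.) Differentiation
yields

$$\frac{\partial}{\partial t}\iiint_G \Psi\Psi^*\,dx\,dy\,dz = \iiint_G\left(\Psi\frac{\partial \Psi^*}{\partial t} + \Psi^*\frac{\partial \Psi}{\partial t}\right)dx\,dy\,dz. \qquad (8\cdot3)$$

Let $\Psi$ be a solution of Eq. (7·9) for the case of a single particle of
charge e. Using the time derivatives of $\Psi$ and $\Psi^*$ taken from this
equation, we deduce the relation

$$\frac{\partial}{\partial t}\iiint_G \Psi\Psi^*\,dx\,dy\,dz = \frac{2\pi i}{h}\iiint_G\left(\Psi H^*\Psi^* - \Psi^* H\Psi\right)dx\,dy\,dz$$
$$= \iiint_G\left\{\frac{h}{4\pi i \mu}\left(\Psi\nabla^2\Psi^* - \Psi^*\nabla^2\Psi\right) + \frac{e}{\mu c}\left[\vec{\mathcal{A}}\cdot\operatorname{grad}(\Psi\Psi^*) + \Psi\Psi^*\operatorname{div}\vec{\mathcal{A}}\right]\right\}dx\,dy\,dz. \qquad (8\cdot4)$$

With the aid of Gauss's divergence theorem the volume integral on the
right side of the above equation can be replaced by a surface integral.
Thus, if S denotes the aggregate of the surfaces $A, S_1, S_2, \ldots$,
and $\vec{I}$ denotes the vector

$$\vec{I} = \frac{h}{4\pi i}\left(\Psi^*\operatorname{grad}\Psi - \Psi\operatorname{grad}\Psi^*\right) - \frac{e}{c}\vec{\mathcal{A}}\,\Psi\Psi^*, \qquad (8\cdot5)$$

we obtain the equation of continuity
$\operatorname{div}\vec{I} = -\dfrac{\partial}{\partial t}(\mu\Psi\Psi^*)$
and the integrated relation

$$\frac{\partial}{\partial t}\iiint_G \mu\Psi\Psi^*\,dx\,dy\,dz = -\iint_S I_n\,dS, \qquad (8\cdot6)$$

where $I_n$ is the component of $\vec{I}$ along the outward normal to S.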

Clearly, as the integral of $\mu\Psi\Psi^*$ over any region represents the
statistical average mass in that region, $\vec{I}$ must represent a
statistical mass current density.
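
The current density can be checked on the simplest case. The sketch below
is an illustration under stated assumptions, not part of the original
argument: it takes a one-dimensional plane wave with no vector potential,
uses arbitrary units with h = 1, and evaluates the field-free part of
Eq. (8·5) by a central difference. The names `H`, `SIGMA`, `psi`, and
`mass_current` are hypothetical.

```python
import cmath
import math

H = 1.0       # Planck's constant in arbitrary units (assumption)
SIGMA = 0.7   # wave number, i.e. reciprocal wave length (assumption)

def psi(x):
    """Plane wave exp(2 pi i sigma x) at a fixed instant; |psi|^2 = 1."""
    return cmath.exp(2j * math.pi * SIGMA * x)

def mass_current(x, dx=1e-5):
    """(h / 4 pi i) (psi* dpsi/dx - psi dpsi*/dx), with a central difference."""
    dpsi = (psi(x + dx) - psi(x - dx)) / (2.0 * dx)
    bracket = psi(x).conjugate() * dpsi - psi(x) * dpsi.conjugate()
    return (H / (4j * math.pi)) * bracket

current = mass_current(0.3)
```

For unit density the result is h times the wave number, i.e. the momentum
per particle, which is what a mass current density should give when the
mass density is the particle mass times one particle per unit length.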

Since experience shows that matter is not created at a simple center of
force we must either prove that quadratically integrable $\Psi$ functions
satisfy the condition that

$$\lim_{\rho_k \to 0}\iint_{S_k} I_n\,dS = 0, \qquad (\rho_k = \text{radius of } S_k) \qquad (8\cdot7)$$

or else we must impose the condition (8·7) or equivalent additional
restrictions on physically admissible wave functions.

Similarly we must rule out wave functions involving an inward or
outward flow of matter from infinity. This may be done by requiring
that

$$\lim_{r \to \infty}\left[r^2 I_n(x, y, z)\right] = 0. \qquad (8\cdot8)$$

In this case, as in the preceding one, the restriction is a mild one which
is automatically fulfilled by monochromatic wave functions.

The interpretation of the vector $\vec{I}$ as a current density, while
based on the interpretation of $|\Psi|^2\,dx\,dy\,dz$ as probability, is
nevertheless transferable to $\Psi$ functions which are not quadratically
integrable. Mathematical simplicity in the discussion of long steady
streams of independent particles is frequently obtained by treating them
as if they were infinitely long and therefore contained altogether an
infinite number of particles. $\Psi$ functions which are not quadratically
integrable are obviously adapted to the description of such streams, which
can be regarded as the limits of finite streams as the volume over which
they are steady becomes infinite. In using $\Psi$ functions for this
purpose we omit all normalization and use $|\Psi|^2\,dx\,dy\,dz$ as the
relative probability of the volume element dx dy dz and the vector
$\vec{I}$ as the relative current density.

In the case of the problem of n particles the wave functions have been
shown to be spread over a 3n-dimensional coordinate space. Let $d\tau$
denote the volume element $dx_1 \cdots dz_n$ in this space. If the integral
$\int\Psi\Psi^*\,d\tau$ converges, we say that $\Psi$ is quadratically
integrable. The conditions which must be imposed on $\Psi$ in order to
insure the constancy of $\int\Psi\Psi^*\,d\tau$ in time are similar to
those which we have derived in the three-dimensional case. (These
conditions will receive further consideration in Secs. 17 and 32d.) Here
we assume that these conditions are fulfilled for all physically
admissible wave functions. Such wave functions can then be normalized in
accordance with the condition

$$\int\Psi\Psi^*\,d\tau = 1. \qquad (8\cdot9)$$

It will usually be assumed that this process of normalization has actually
been carried out. By an obvious extension of our hypothesis regarding
the physical interpretation of three-dimensional $\Psi$ waves we assume in
this more general case that $\Psi\Psi^*\,d\tau$ is the probability that the
system represented by $\Psi$ has a configuration lying in the range defined
by $d\tau$.

8c. A System Consisting of Two Independent Parts. It will be
instructive to test the validity of the above assumption by the
consideration of a special case in which the system consists of two
independent parts. Considered separately, the two parts have wave
functions $\Psi_1$ and $\Psi_2$ which satisfy the Schrödinger equations

$$H_1\Psi_1 = -\frac{h}{2\pi i}\frac{\partial \Psi_1}{\partial t}, \qquad (8\cdot10)$$

$$H_2\Psi_2 = -\frac{h}{2\pi i}\frac{\partial \Psi_2}{\partial t}. \qquad (8\cdot11)$$

On the other hand, considered as a single system, they should have a
wave function $\Psi$ which satisfies the equation

$$H\Psi = -\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}. \qquad (8\cdot12)$$

Here the Hamiltonian operator H for the united system is equal to the
sum of the Hamiltonian operators $H_1$, $H_2$ of the parts. Since the
probability of the simultaneous occurrence of two independent events is
equal to the product of the probabilities of the individual events, the
assumed physical interpretation of $|\Psi|^2$ demands that

$$|\Psi|^2\,d\tau = |\Psi_1|^2\,d\tau_1 \cdot |\Psi_2|^2\,d\tau_2.$$

The volume element $d\tau$ in the configuration space of the united system
is equal to the product of corresponding volume elements $d\tau_1$ and
$d\tau_2$ in the separate configuration spaces, so that
$|\Psi|^2 = |\Psi_1|^2|\Psi_2|^2$. This relation is satisfied if
$\Psi = \Psi_1\Psi_2$, and substitution in (8·12) shows that
$\Psi_1\Psi_2$ is actually a solution of the Schrödinger equation for the
united system. We conclude that the assumed physical interpretation of
$|\Psi|^2$ is in satisfactory agreement with the structure of the
Schrödinger equation.
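
The substitution referred to above can be written out in one line. The
sketch below assumes only what independence of the parts implies, namely
that $H_1$ acts on the coordinates of the first part alone and $H_2$ on
those of the second alone, and that each factor obeys a Schrödinger
equation of the same form as (8·12).

```latex
\begin{aligned}
(H_1 + H_2)\,\Psi_1\Psi_2
  &= \Psi_2\,(H_1\Psi_1) + \Psi_1\,(H_2\Psi_2) \\
  &= -\frac{h}{2\pi i}\left(\Psi_2\frac{\partial\Psi_1}{\partial t}
       + \Psi_1\frac{\partial\Psi_2}{\partial t}\right)
   = -\frac{h}{2\pi i}\,\frac{\partial}{\partial t}\left(\Psi_1\Psi_2\right).
\end{aligned}
```

The first step uses the fact that each operator treats the other factor as
a constant; the last step is the product rule for the time derivative.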

Incidentally, if the functions $\Psi_1$ and $\Psi_2$ have the
monochromatic form

$$\Psi_1 = \psi_1(x_1, y_1, z_1)\,e^{-\frac{2\pi i}{h}E_1 t}, \qquad \Psi_2 = \psi_2(x_2, y_2, z_2)\,e^{-\frac{2\pi i}{h}E_2 t},$$

the product function will also be monochromatic and its time factor will
be

$$e^{-\frac{2\pi i}{h}(E_1 + E_2)t}.$$

Thus the total energy of the system as defined by its vibration frequency
is equal to the sum of the energies of its parts. Although this energy
relationship is correct classically, it is somewhat disconcerting when
interpreted as a frequency relation. To say that the frequency of the
system treated as a whole is the sum of the frequencies of its parts is
evidently to say that the waves have no objective reality, but are mere
mathematical tools for predicting the behavior of the associated particles.
The same view is supported by the discussion at the end of Sec. 7, which
shows that the form of $\Psi$ in any given case depends on the particular
choice of the vector potential used in setting up the wave equations.

The author is indebted to Prof. Frenkel for the suggestion that
the fundamental reason for the use of complex $\Psi$ functions is that this
is the only way of satisfying both the rule for combining independent
probabilities and the law of addition for energy. Thus, if we assume that
$\Psi = \Psi_1\Psi_2$ and give $\Psi_1$ and $\Psi_2$ the real forms

$$\Psi_1 = \psi_1\cos\frac{2\pi E_1 t}{h}, \qquad \Psi_2 = \psi_2\cos\frac{2\pi E_2 t}{h},$$

we find that $\Psi$ is not monochromatic but involves the two frequencies
$(E_1 + E_2)/h$ and $|E_1 - E_2|/h$.
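
The contrast between complex and real time factors can be checked
numerically. The following sketch is illustrative only: the energies `E1`,
`E2` and the unit h = 1 are arbitrary assumptions, and the real forms are
taken to be cosines. It confirms that the product of two complex
exponentials oscillates at the single frequency (E1 + E2)/h, while the
product of two cosines is an equal mixture of the sum and difference
frequencies.

```python
import cmath
import math

H = 1.0               # Planck's constant in arbitrary units (assumption)
E1, E2 = 3.0, 1.25    # hypothetical energies of the two parts

def complex_product(t):
    """Product of the two complex monochromatic time factors."""
    f1 = cmath.exp(-2j * math.pi * E1 * t / H)
    f2 = cmath.exp(-2j * math.pi * E2 * t / H)
    return f1 * f2

def real_product(t):
    """Product of the two real (cosine) time factors."""
    return math.cos(2 * math.pi * E1 * t / H) * math.cos(2 * math.pi * E2 * t / H)

def two_frequency_form(t):
    """Half-sum of cosines at the frequencies (E1 + E2)/h and |E1 - E2|/h."""
    return 0.5 * (math.cos(2 * math.pi * (E1 + E2) * t / H)
                  + math.cos(2 * math.pi * abs(E1 - E2) * t / H))

samples = [0.013 * k for k in range(200)]

# Deviation of the complex product from a single exponential at (E1 + E2)/h.
max_complex_error = max(
    abs(complex_product(t) - cmath.exp(-2j * math.pi * (E1 + E2) * t / H))
    for t in samples)

# Deviation of the real product from the two-frequency mixture.
max_real_error = max(abs(real_product(t) - two_frequency_form(t)) for t in samples)
```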



CHAPTER II 


WAVE PACKETS AND THE RELATION BETWEEN CLASSICAL 
MECHANICS AND WAVE MECHANICS 


9. WAVE PACKETS AND GROUP VELOCITY IN A ONE-DIMENSIONAL 
HOMOGENEOUS MEDIUM 


9a. The Fourier Integral Theorem. In this chapter we shall develop
the theory of wave packets in order to exhibit more fully the asymptotic
agreement between classical mechanics and wave mechanics which was
assumed in setting up the Schrödinger equation in Chap. I. The theory
of wave packets will then be used to formulate a more complete statistical
interpretation of the $\Psi$ waves.

We begin with a consideration of the motion of free particles in one
dimension. The de Broglie waves associated with such particles move
with a uniform speed independent of the coordinate x, like light waves
in a homogeneous medium. Since the potential energy is constant,
we may set it equal to zero without loss of generality. The Schrödinger
equation (5·10) then reduces to the form

$$-\frac{h^2}{8\pi^2\mu}\frac{\partial^2 \Psi}{\partial x^2} = -\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}. \qquad (9\cdot1)$$

A particular solution of this equation in the form of a monochromatic
progressive wave is obtained by assuming that $\Psi$ has the form

$$\Psi = e^{2\pi i(\sigma x - \nu t)}. \qquad (9\cdot2)$$

This expression satisfies (9·1) provided that the constants $\sigma$ and
$\nu$ are related by the formula

$$h\nu = \frac{h^2\sigma^2}{2\mu}. \qquad (9\cdot3)$$

If we interpret the frequency $\nu$ and the wave number $\sigma$ in
accordance with Eqs. (2·1) and (2·2), we see that the above condition is
equivalent to the classical energy formula

$$E = \frac{p^2}{2\mu}.$$

A more general solution of (9·1) is obtained by taking a linear
combination of such particular solutions. No such combination of discrete
solutions of the above type is quadratically integrable, however, and to
obtain a quadratically integrable wave function we have recourse to an
integral over a one-parameter family of functions of the type (9·2).
For this purpose we make use of the Fourier integral theorem which is
best stated for our purpose in the following form.

Let f(t) denote a complex function of t whose real and imaginary parts
satisfy the Dirichlet condition¹ in every finite interval and which yields
a convergent integral $\int_{-\infty}^{+\infty}|f(t)|\,dt$. Then the integral

$$g(x) = \int_{-\infty}^{+\infty} f(t)\,e^{2\pi i x t}\,dt$$

exists and implies that

$$f(t) = \int_{-\infty}^{+\infty} g(x)\,e^{-2\pi i x t}\,dx.$$

g(x) is called the Fourier transform of f(t). If
$\int_{-\infty}^{+\infty}|f(t)\,t|\,dt$ also exists, the integral
representing g(x) can be differentiated under the integral sign.² If f(t)
is quadratically integrable g(x) is also quadratically integrable. In
fact,³

$$\int_{-\infty}^{+\infty}|f(t)|^2\,dt = \int_{-\infty}^{+\infty}|g(x)|^2\,dx.$$

¹ The Dirichlet condition is said to be satisfied by a function f(t) in
the interval a < t < b, provided that the interval can be split into a
finite number of partial intervals in each of which the function is
continuous and monotone. For complete rigor, f(t) must be so defined at
points of discontinuity that

$$f(t) = \tfrac{1}{2}\left[f(t + 0) + f(t - 0)\right].$$

² Cf. S. Bochner, Vorlesungen über Fouriersche Integrale, p. 92, Satz 35,
Leipzig, 1932.

³ The theorem is known to mathematicians as Plancherel's theorem [cf.
M. Plancherel, Rend. di Palermo 30, 289 (1910); E. Titchmarsh, Proc.
London Math. Soc. (2) 23, 279 (1923)], although the relation was first
derived by the elder Lord Rayleigh on the assumption that the integrals
under consideration are convergent [Phil. Mag. (5) 27, 466 (1889)]. It
follows from the work of Plancherel that in the proof of this theorem we
may dispense with the requirement that f(t) is absolutely integrable,
provided that the Fourier transform g(x) is suitably defined.

A more general form of the theorem can be derived by applying it to a
linear combination $f_1 + \alpha f_2$ of two quadratically integrable
functions $f_1$ and $f_2$ with the Fourier transforms $g_1$ and $g_2$,
respectively. It follows that for all complex values of $\alpha$,

$$\alpha\int_{-\infty}^{+\infty} f_2 f_1^*\,dt + \alpha^*\int_{-\infty}^{+\infty} f_1 f_2^*\,dt = \alpha\int_{-\infty}^{+\infty} g_2 g_1^*\,dx + \alpha^*\int_{-\infty}^{+\infty} g_1 g_2^*\,dx.$$

Hence,

$$\int_{-\infty}^{+\infty} f_1 f_2^*\,dt = \int_{-\infty}^{+\infty} g_1 g_2^*\,dx.$$

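We accordingly define $\Psi(x,t)$ by the equation

$$\Psi(x,t) = \int_{-\infty}^{+\infty} G(\sigma)\,e^{2\pi i(\sigma x - \nu t)}\,d\sigma, \qquad (9\cdot4)$$

where $G(\sigma)$ is assumed to satisfy the conditions imposed on f(t).
Differentiation under the integral sign shows that $\Psi(x,t)$ is a
solution of (9·1) if $\sigma$ and $\nu$ are related by (9·3). If we define
$\Phi(\sigma,t)$ by

$$\Phi(\sigma,t) = G(\sigma)\,e^{-2\pi i \nu t}, \qquad (9\cdot5)$$

it follows from the Fourier integral theorem that

$$\Phi(\sigma,t) = \int_{-\infty}^{+\infty}\Psi(x,t)\,e^{-2\pi i \sigma x}\,dx. \qquad (9\cdot6)$$

Moreover, it follows from Plancherel's theorem¹ that $\Psi$ can be
normalized by normalizing G. Thus

$$\int_{-\infty}^{+\infty}\Psi\Psi^*\,dx = \int_{-\infty}^{+\infty}\Phi\Phi^*\,d\sigma = \int_{-\infty}^{+\infty} G G^*\,d\sigma.$$

Conversely, if $\Psi(x,0)$ satisfies the conditions imposed on
$G(\sigma)$, and if $\Psi(x,t)$ is a solution of Eq. (9·1), it follows
that $\Psi(x,t)$ is expressible in the form (9·4) and that Eqs. (9·5) and
(9·6) are valid.²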

9b. Derivation of Group-velocity Formula. Let us next consider a
special case in which $G(\sigma)$ is a function having a maximum at the
point $\sigma = \sigma_0$ and approaching zero monotonically as
$|\sigma - \sigma_0|$ increases. We shall suppose that G is sensibly equal
to zero outside of a certain small interval M containing the point
$\sigma = \sigma_0$. The wave function $\Psi$ defined by Eq. (9·4) can then
be described as a "wave packet" since it is composed of a group or packet
of monochromatic progressive waves all of which have approximately the
same wave number $\sigma$ and the same direction of motion.

In order to study the behavior of such a packet it is convenient to
resolve $\Psi$ into its real and imaginary parts $\Psi_r$ and $\Psi_i$,
respectively. Let $G(\sigma)$ be real and positive. If we denote the phase
angle $2\pi(\sigma x - \nu t)$ by $\varphi$ we have

$$\Psi_r = \int_{-\infty}^{+\infty} G(\sigma)\cos\varphi\,d\sigma, \qquad (9\cdot7)$$

$$\Psi_i = \int_{-\infty}^{+\infty} G(\sigma)\sin\varphi\,d\sigma. \qquad (9\cdot8)$$

¹ See footnote 3, p. 36.

² If we define $F(\sigma)$ by the equation

$$F(\sigma) = \int_{-\infty}^{+\infty}\Psi(x,0)\,e^{-2\pi i \sigma x}\,dx,$$

it follows from the Fourier theorem that

$$\Psi(x,0) = \int_{-\infty}^{+\infty} F(\sigma)\,e^{2\pi i \sigma x}\,d\sigma.$$

$\chi(x,t) \equiv \int_{-\infty}^{+\infty} F(\sigma)\,e^{2\pi i(\sigma x - \nu t)}\,d\sigma$
is evidently a solution of the differential equation (9·1) if $\sigma$ and
$\nu$ are related by (9·3) and reduces to $\Psi(x,0)$ at t = 0. Since the
solutions of (9·1) are uniquely determined by their form at some initial
instant, t = 0 (cf. footnote 2, p. 18), $\chi$ is identical with $\Psi$ and
the theorem is proved.

In evaluating these integrals x and t are to be treated as fixed
parameters, while $\nu$ is a function of the variable of integration
$\sigma$. The functions $\sin\varphi$ and $\cos\varphi$ are oscillatory
functions of $\sigma$ with the variable wave length¹

$$\lambda_\sigma = 2\pi \Big/ \frac{\partial \varphi}{\partial \sigma}. \qquad (9\cdot9)$$

The complete integrand of $\Psi_r$ or $\Psi_i$ is therefore an oscillatory
function with the envelope

$$y = \pm G(\sigma),$$

as illustrated in Fig. 1.

If $\lambda_\sigma$ is small in the region M, the positive and negative
loops will tend to cancel each other and the integrals will vanish to a
high order of approximation. In this case the monochromatic constituents
of $\Psi$ cancel each other at the space-time point x, t under
consideration, "by interference," to use the language of physical optics.
Actually $\lambda_\sigma$ is small outside the neighborhood of that point
$\sigma'$ on the $\sigma$ axis, where

$$\frac{\partial \varphi}{\partial \sigma} = 0. \qquad (9\cdot10)$$


¹ In the case of a monochromatic wave in a one-dimensional homogeneous
medium the phase angle $\varphi(x)$ is a linear function of the coordinate
x and the wave length is then defined by the relation

$$\frac{2\pi}{\lambda} = \text{coefficient of } x \text{ in } \varphi. \qquad (a)$$

It could equally well be defined by

$$\frac{2\pi}{\lambda} = \frac{\partial \varphi}{\partial x} = \text{rate of change of phase with } x. \qquad (b)$$

In the case of a nonhomogeneous medium the first definition is
inapplicable and we define the local wave length by Eq. (b). Then the
phase difference of two points $x_1$ and $x_2$ is given by

$$\varphi(x_2) - \varphi(x_1) = 2\pi\int_{x_1}^{x_2}\frac{dx}{\lambda}.$$

At the point in question the wave length is infinite. Then, for given
values of x and t, $\Psi$ has a maximum value if $\sigma'$ is identical
with $\sigma_0$, but, if $\sigma'$ and $\sigma_0$ are very different,
$\Psi$ is sensibly equal to zero. If we now allow x to vary, holding t
fast, we see that we can always give it a value which will bring $\sigma'$
and $\sigma_0$ together. This value is

$$\bar{x} = t\left(\frac{d\nu}{d\sigma}\right)_{\sigma = \sigma_0}$$

and locates the "center" of the wave packet as a function of time. The
speed with which this point moves along the x axis is called the group
velocity and has the value¹

$$v_g = \left(\frac{d\nu}{d\sigma}\right)_{\sigma = \sigma_0} = \frac{h\sigma_0}{\mu}. \qquad (9\cdot11)$$
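
A short numerical sketch may make the distinction between group and phase
velocity concrete. It assumes the free-particle dispersion relation (9·3),
picks an electron and a wave number of 10^10 per metre purely for
illustration, and estimates the derivative of the frequency by a central
difference. The function names and the step `d` are hypothetical choices.

```python
H = 6.62607015e-34       # Planck's constant, J s
MU = 9.1093837015e-31    # electron mass, kg (illustrative choice of particle)

def nu(sigma):
    """Dispersion relation (9.3): nu = h sigma^2 / (2 mu)."""
    return H * sigma ** 2 / (2.0 * MU)

def group_velocity(sigma0, d=1.0e6):
    """Central-difference estimate of d(nu)/d(sigma) at sigma0."""
    return (nu(sigma0 + d) - nu(sigma0 - d)) / (2.0 * d)

sigma0 = 1.0e10                       # wave number 1/lambda in 1/m (assumed)
v_group = group_velocity(sigma0)      # speed of the center of the packet
v_phase = nu(sigma0) / sigma0         # speed of the individual wave crests
v_classical = H * sigma0 / MU         # p / mu with p = h sigma
```

Under these assumptions the group velocity coincides with the classical
particle velocity p/μ, while the phase velocity is half as large, so the
individual crests fall back through the packet as it advances.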


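To determine the form of the wave packet $\Psi(x,t)$ more exactly we
substitute $\sigma_0 + \eta$ for $\sigma$ in (9·4), and make use of (9·3).
The phase angle $\varphi = 2\pi(x\sigma - \nu t)$ becomes

$$\varphi = 2\pi\left[x(\sigma_0 + \eta) - \frac{h}{2\mu}(\sigma_0 + \eta)^2 t\right]$$

or

$$\varphi = 2\pi\left[x\sigma_0 - \frac{h\sigma_0^2 t}{2\mu} + \left(x - \frac{h\sigma_0 t}{\mu}\right)\eta - \frac{h t}{2\mu}\eta^2\right].$$

In view of (9·3) we introduce the abbreviation
$\nu_0 = h\sigma_0^2/2\mu$. Then (9·4) and (9·11) yield

$$\Psi(x,t) = e^{2\pi i(x\sigma_0 - \nu_0 t)}\int_{-\infty}^{+\infty} G(\sigma_0 + \eta)\,e^{2\pi i\left[(x - v_g t)\eta - \frac{h\eta^2 t}{2\mu}\right]}\,d\eta. \qquad (9\cdot12)$$

However, according to hypothesis, G is negligible outside an interval M
enclosing the point $\sigma = \sigma_0$. In this interval $\eta$ is small.
Therefore we can neglect the term $h\eta^2 t/2\mu$ in the exponent of the
integrand in (9·12), provided we restrict the discussion to values of t
which are not too large. With this approximation we have

$$\Psi(x,t) \cong e^{2\pi i(x\sigma_0 - \nu_0 t)}\int_{-\infty}^{+\infty} G(\sigma_0 + \eta)\,e^{2\pi i \xi\eta}\,d\eta,$$

where $\xi \equiv x - v_g t$. We denote the integral on the right by
$u(\xi)$:

$$u(\xi) = \int_{-\infty}^{+\infty} G(\sigma_0 + \eta)\,e^{2\pi i \xi\eta}\,d\eta. \qquad (9\cdot13)$$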

1 Equation (9-11) is clearly equivalent to the more usual formula 


For a more elementary discussion of group velocity see A. Schuster and
J. W. Nicholson, The Theory of Optics, 3d ed., p. 326, London, 1928. The
writer is indebted to Prof. H. A. Kramers for suggesting the above given
method of treating the subject.


$u(\xi)$ is readily seen to be a modulating amplitude factor with its
maximum at $\xi = 0$, or $x = v_g t$. On either side of the center of the
packet $u(\xi)$ fades off gradually toward zero. Thus the ratio of the
amplitude u at an arbitrary point x to its value at the center of the
packet is given by

$$\frac{u(\xi)}{u(0)} = \frac{\int_{-\infty}^{+\infty} G(\sigma_0 + \eta)\,e^{2\pi i \xi\eta}\,d\eta}{\int_{-\infty}^{+\infty} G(\sigma_0 + \eta)\,d\eta}.$$

Clearly this ratio will be quite small if $e^{2\pi i \xi\eta}$ goes through
more than two cycles, as $\eta$ ranges through the interval M. Hence we can
roughly locate the "ends" of the packet by those values of $\xi$ for which
$e^{2\pi i \xi\eta}$ goes through exactly two cycles in M. Let us designate
the range of values of $\eta$ in M as $2\eta_1$. The "length" of the wave
train is accordingly $2/\eta_1$, i.e., it varies inversely with the range
of wave numbers of which the train is composed.
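
The estimate of the packet length can be tried on a specific amplitude
function. The sketch below is only an illustration: it assumes a
cosine-squared profile for G on the interval of half-width `ETA1` (a
hypothetical choice that has a single maximum and falls monotonically to
zero at the edges), and it evaluates the integral (9·13) by the midpoint
rule.

```python
import cmath
import math

ETA1 = 0.05   # half-width of the wave-number interval M (assumed value)

def G(eta):
    """Hypothetical amplitude function: cosine-squared bump on |eta| <= ETA1."""
    if abs(eta) > ETA1:
        return 0.0
    return math.cos(math.pi * eta / (2.0 * ETA1)) ** 2

def u(xi, n=4000):
    """Midpoint-rule value of the integral of G(eta) exp(2 pi i xi eta) over M."""
    step = 2.0 * ETA1 / n
    total = 0j
    for k in range(n):
        eta = -ETA1 + (k + 0.5) * step
        total += G(eta) * cmath.exp(2j * math.pi * xi * eta)
    return total * step

def amplitude_ratio(xi):
    """|u(xi)| / |u(0)|, the relative amplitude at distance xi from the center."""
    return abs(u(xi)) / abs(u(0.0))

ratio_center = amplitude_ratio(0.0)
ratio_half = amplitude_ratio(0.5 / ETA1)   # exponential completes one cycle in M
ratio_end = amplitude_ratio(1.0 / ETA1)    # exponential completes two cycles in M
```

For this particular profile the amplitude has dropped to one half when the
exponential completes one cycle across M and vanishes when it completes
two, so the ends of the train lie at a distance 1/η₁ on either side of the
center and the total length is 2/η₁.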

It will be observed that the special choice of $G(\sigma)$, as real and
positive, gives all the components of the packet the common phase angle
zero at the space-time point x = 0, t = 0. All components can be given the
arbitrary phase angle $\varphi_0$ at any point $x = x_0$, $t = t_0$ if we
give G the form

$$G(\sigma) = F(\sigma)\,e^{\,i\left[\varphi_0 - 2\pi\left(x_0\sigma - \frac{h\sigma^2 t_0}{2\mu}\right)\right]}, \qquad (9\cdot14)$$

where F is real and positive.


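It will be observed that the formula $l = 2/\eta_1$ for the length of the
wave train formed by the packet depends on the special choice of the
function $G(\sigma)$ and on an approximation which must break down for
large values of t. When G is chosen so as to give all components the
common phase angle $\varphi_0$ at $t = t_0$, the length of the wave train
is a minimum at $t = t_0$. A slightly different analysis shows that the
size of the train is a quadratic function of the time. For this purpose we
may agree to define the length as some multiple of the root-mean-square
value of $x - \bar{x}$. Denoting this root-mean-square value by
$\Delta x$, we have

$$(\Delta x)^2 = \int_{-\infty}^{+\infty}(x - \bar{x})^2|\Psi|^2\,dx. \qquad (9\cdot15)$$

If $G(\sigma)$ has the form given above,

$$\bar{x} = x_0 + \frac{h\sigma_0}{\mu}(t - t_0), \qquad (9\cdot16)$$

where $\sigma_0$ is the maximum point of $F(\sigma)$. Let w denote the
function

$$w \equiv \varphi_0 - 2\pi\left(x_0\sigma - \frac{h\sigma^2 t_0}{2\mu} + \nu t\right). \qquad (9\cdot17)$$

Then

$$F e^{iw} = \int_{-\infty}^{+\infty}\Psi\,e^{-2\pi i \sigma x}\,dx, \qquad (9\cdot18)$$

$$-\frac{1}{2\pi i}\frac{\partial}{\partial \sigma}\left(F e^{iw}\right) - \left[x_0 + \frac{h\sigma_0}{\mu}(t - t_0)\right]F e^{iw} = \int_{-\infty}^{+\infty}(x - \bar{x})\,\Psi\,e^{-2\pi i \sigma x}\,dx. \qquad (9\cdot19)$$

The left-hand member is readily reduced to the form

$$e^{iw}\left[\frac{i}{2\pi}\frac{dF}{d\sigma} + \frac{h}{\mu}(\sigma - \sigma_0)(t - t_0)F\right].$$

Hence, by Plancherel's theorem, (9·6),

$$(\Delta x)^2 = \int_{-\infty}^{+\infty}\left[\frac{1}{4\pi^2}\left(\frac{dF}{d\sigma}\right)^2 + \frac{h^2}{\mu^2}(\sigma - \sigma_0)^2(t - t_0)^2F^2\right]d\sigma.$$

This equation shows that the size of the wave packet varies quadratically
with the time by an amount proportional to $h^2/\mu^2$ and that the minimum
root-mean-square value of $x - \bar{x}$ is
$\dfrac{1}{2\pi}\left[\int_{-\infty}^{+\infty}\left(\dfrac{dF}{d\sigma}\right)^2 d\sigma\right]^{1/2}$.
Giving $F(\sigma)$ the form of a normalized Gaussian error function, such
as

$$F(\sigma) = \frac{a^{1/2}}{\pi^{1/4}}\,e^{-\frac{a^2(\sigma - \sigma_0)^2}{2}},$$

we find that, at $t = t_0$,

$$\Delta x = \frac{a}{2\pi\sqrt{2}}.$$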
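
The Gaussian case lends itself to a quick numerical check. The sketch
below assumes the normalized Gaussian form of F(σ) with hypothetical
values of the width parameter `A_WIDTH` and the center `SIGMA0`, and it
takes the minimum root-mean-square width to be 1/(2π) times the square
root of the integral of (dF/dσ)². It also evaluates the corresponding
spread in wave number, a quantity not named in the passage above but
useful for comparison.

```python
import math

A_WIDTH = 2.0   # the parameter a of the Gaussian (assumed value)
SIGMA0 = 5.0    # center of the wave-number distribution (assumed value)

def F(s):
    """Normalized Gaussian amplitude: integral of F^2 over sigma equals 1."""
    return math.sqrt(A_WIDTH) / math.pi ** 0.25 * math.exp(
        -(A_WIDTH * (s - SIGMA0)) ** 2 / 2.0)

def dF(s):
    """Analytic derivative of F with respect to sigma."""
    return -A_WIDTH ** 2 * (s - SIGMA0) * F(s)

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule for a real function f on [a, b]."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * step)
    return total * step

lo, hi = SIGMA0 - 10.0 / A_WIDTH, SIGMA0 + 10.0 / A_WIDTH

norm = integrate(lambda s: F(s) ** 2, lo, hi)
# Minimum r.m.s. width in x, reached at t = t0.
delta_x = math.sqrt(integrate(lambda s: dF(s) ** 2, lo, hi)) / (2.0 * math.pi)
# R.m.s. spread of the wave number about SIGMA0, weighted by F^2.
delta_sigma = math.sqrt(integrate(lambda s: (s - SIGMA0) ** 2 * F(s) ** 2, lo, hi))
```

With these assumptions the computed width agrees with a/(2π√2), and the
product of the two spreads comes out as 1/(4π) whatever value of a is
chosen, which shows the inverse relation between the length of the train
and the range of wave numbers in quantitative form.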

Our primary conclusion is that the wave packet defined by Eq. (9·4)
and by the assumed form of the amplitude function $G(\sigma)$ represents a
train of waves with the approximate wave length $1/\sigma_0$ [cf. Eq.
(9·13)] moving forward individually with the phase velocity
$\nu_0/\sigma_0$, but having a variable amplitude such that the whole
disturbance is confined to a short interval of the x axis whose center
$\bar{x}$ moves forward with the group velocity $v_g$. Conversely, it may
be proved that if at any time $\Psi(x,t)$ has the form just described, it
must be possible to resolve it into a narrow continuous spectrum of
monochromatic waves such as that defined by Eq. (9·4).

10. WAVE PACKETS IN THREE DIMENSIONS 

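The extension of this discussion to the motion of a free particle
in three dimensions is very simple. The wave equation (5·10) for zero
potential is solved by setting

$$\Psi = e^{2\pi i(x\sigma_x + y\sigma_y + z\sigma_z - \nu t)}, \qquad (10\cdot1)$$

$$h\nu = \frac{h^2}{2\mu}\left(\sigma_x^2 + \sigma_y^2 + \sigma_z^2\right). \qquad (10\cdot2)$$

This particular solution represents an infinite plane wave having the
wave-number components $\sigma_x, \sigma_y, \sigma_z$. A wave packet is
obtained by forming the integral

$$\Psi = \iiint_{\infty} G(\sigma_x, \sigma_y, \sigma_z)\,e^{2\pi i(x\sigma_x + y\sigma_y + z\sigma_z - \nu t)}\,d\sigma_x\,d\sigma_y\,d\sigma_z, \qquad (10\cdot3)$$

in which, for simplicity, we may assume G to be a real function which
vanishes everywhere except in a small region around the point
$\sigma_{0x}, \sigma_{0y}, \sigma_{0z}$ in $\sigma$ space. Introducing
suitable assumptions regarding the convergence of integrals of $|G|$ and
$|G|^2$, when extended over all $\sigma$ space, we may prove the quadratic
integrability of $\Psi$. As before, we introduce a phase function
$\varphi$ defined in this case by

$$\varphi = 2\pi(x\sigma_x + y\sigma_y + z\sigma_z - \nu t) \qquad (10\cdot4)$$

and note that the integral (10·3) will be sensibly equal to zero unless
x, y, z, t are given values which make the wave length of $\varphi$,
regarded as a function of $\sigma_x, \sigma_y, \sigma_z$, large at the
point $\sigma_{0x}, \sigma_{0y}, \sigma_{0z}$, where G is a maximum.
It follows that the center of the packet $\bar{x}, \bar{y}, \bar{z}$ is at
the point where

$$\frac{\partial \varphi}{\partial \sigma_x} = \frac{\partial \varphi}{\partial \sigma_y} = \frac{\partial \varphi}{\partial \sigma_z} = 0$$

for $\sigma_x = \sigma_{0x}$; $\sigma_y = \sigma_{0y}$;
$\sigma_z = \sigma_{0z}$ [cf. Eq. (9·9)]. This is equivalent to the
requirement that

$$\bar{x} = t\left(\frac{\partial \nu}{\partial \sigma_x}\right)_0, \qquad \bar{y} = t\left(\frac{\partial \nu}{\partial \sigma_y}\right)_0, \qquad \bar{z} = t\left(\frac{\partial \nu}{\partial \sigma_z}\right)_0.$$

We conclude that the center of the packet moves with a constant
vector velocity whose components are

$$(v_g)_x = \left(\frac{\partial \nu}{\partial \sigma_x}\right)_0 = \frac{h\sigma_{0x}}{\mu}, \qquad (v_g)_y = \frac{h\sigma_{0y}}{\mu}, \qquad (v_g)_z = \frac{h\sigma_{0z}}{\mu}.$$

In other words it moves along the normal to the median wave of the
packet $(\sigma_{0x}, \sigma_{0y}, \sigma_{0z})$ with the speed

$$|v_g| = \left[(v_g)_x^2 + (v_g)_y^2 + (v_g)_z^2\right]^{1/2} = \frac{h}{\mu}\left(\sigma_{0x}^2 + \sigma_{0y}^2 + \sigma_{0z}^2\right)^{1/2}.$$

Because wave packets are special cases of quadratically integrable
non-monochromatic solutions of the second Schrödinger equation, it
is convenient to apply the term normal packet function to every such
solution.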

*11. WAVE SURFACES AND THE HAMILTON-JACOBI EQUATION OF THE
CLASSICAL DYNAMICS†

The theory of wave packets developed in Secs. 9-10 depends on the 
fact that in the case of free particles we can write down immediately an 
infinite family of monochromatic progressive wave solutions of the 
differential equation in terms of which any quadratically integrable 
wave function can be expressed. In case there is a variable potential 
energy function V, no such family of monochromatic special solutions 
is available. However, we can work out a family of approximate solu-
tions which are applicable in a limited region in the neighborhood of the
wave packet. With the aid of these solutions the instantaneous velocity 
of the packet can be determined. 

As the phase velocity varies from point to point with the potential 
energy, the problem of setting up an approximate description of a 
monochromatic extended wave parallels the corresponding problem 
in the optics of inhomogeneous media. The approximation which we 
shall make is precisely the approximation involved in using the theory of 
geometrical optics instead of the theory of physical optics. In view of 
the analogy between geometrical optics and classical mechanics drawn 
in Sec. 3 the reader will not be surprised to discover that the same 
approximation leads to a connection between the wave equation and the 
Hamilton-Jacobi partial differential equation of the classical dynamics.

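The Schrödinger equation for a single particle in three dimensions is

$$-\frac{h^2}{8\pi^2\mu}\nabla^2\Psi + V(x,y,z)\Psi = -\frac{h}{2\pi i}\frac{\partial \Psi}{\partial t}. \qquad (11\cdot1)$$

The desired approximate solution is obtained by assuming²

$$\Psi = F e^{\frac{2\pi i}{h}A}, \qquad (11\cdot2)$$

where the "phase" $2\pi A/h$ is a real function of x, y, z, t, while F is
either real or complex but independent of the time.

Differentiation yields

$$\nabla^2\Psi = e^{\frac{2\pi i}{h}A}\left\{\nabla^2 F + \frac{2\pi i}{h}\left[F\nabla^2 A + 2(\operatorname{grad}F\cdot\operatorname{grad}A)\right] - \frac{4\pi^2}{h^2}F|\operatorname{grad}A|^2\right\}, \qquad \frac{\partial \Psi}{\partial t} = \frac{2\pi i}{h}F\frac{\partial A}{\partial t}e^{\frac{2\pi i}{h}A}.$$

Hence Eq. (11·1) becomes

$$F\left[\frac{1}{2\mu}|\operatorname{grad}A|^2 + V + \frac{\partial A}{\partial t}\right] - \frac{hi}{4\pi\mu}\left[F\nabla^2 A + 2\left(\frac{\partial F}{\partial x}\frac{\partial A}{\partial x} + \frac{\partial F}{\partial y}\frac{\partial A}{\partial y} + \frac{\partial F}{\partial z}\frac{\partial A}{\partial z}\right)\right] - \frac{h^2}{8\pi^2\mu}\nabla^2 F = 0. \qquad (11\cdot3)$$

† The author is indebted to Prof. J. C. Slater for much of the material in
this section.

² Cf. L. Brillouin, Comptes Rendus 183, 24 (1926), J. de Physique 7, 353
(1926); G. Wentzel, Zeits. f. Physik 38, 518 (1926). The application of
the substitution (11·2) to the location of energy levels is discussed in
Sec. 21.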

If we think of h as a variable and allow it to approach zero we see
that, in the limit, terms in h and $h^2$ of Eq. (11·3) drop out. The
amplitude F is then arbitrary, while A must be a solution of

$$\frac{1}{2\mu}\left[\left(\frac{\partial A}{\partial x}\right)^2 + \left(\frac{\partial A}{\partial y}\right)^2 + \left(\frac{\partial A}{\partial z}\right)^2\right] + V + \frac{\partial A}{\partial t} = 0. \qquad (11\cdot4)$$

Reducing h to zero is equivalent to reducing the wave length to zero.
Hence the wave function obtained by neglecting the terms of Eq. (11·3)
in h and $h^2$ should be appropriate to the discussion of problems governed
by the classical mechanics. Equation (11·4) is in fact one form of the
Hamilton-Jacobi equation. The other and more familiar form is obtained
by introducing the additional assumption that $\Psi$ is monochromatic so
that

$$A(x,y,z,t) = S'(x,y,z) - h\nu t = S' - Et.$$

We then have

$$\frac{1}{2\mu}\left[\left(\frac{\partial S'}{\partial x}\right)^2 + \left(\frac{\partial S'}{\partial y}\right)^2 + \left(\frac{\partial S'}{\partial z}\right)^2\right] + V = E, \qquad (11\cdot5)$$

or

$$|\operatorname{grad}S'| = \sqrt{2\mu(E - V)}. \qquad (11\cdot6)$$

In the classical mechanics a function S' which satisfies this equation
is sometimes called an action function although it is not identical with the
action function used in the formulation of the principle of least action.¹
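
As a small check on the two forms of the Hamilton-Jacobi equation, the
sketch below treats a one-dimensional particle in a uniform field of
force. Everything specific in it is an assumption made for illustration:
the potential V = gx, the numerical values of `MU`, `E`, and `G_FORCE`,
and the closed form of S' obtained by integrating the square root of
2μ(E - V). The derivatives are taken by central differences.

```python
import math

MU = 1.0        # particle mass (arbitrary units, assumed)
E = 2.0         # total energy (assumed)
G_FORCE = 0.5   # slope of the hypothetical linear potential V = g x

def V(x):
    return G_FORCE * x

def S_prime(x):
    """Integral of sqrt(2 mu (E - V)) dx for the linear potential."""
    u = 2.0 * MU * (E - V(x))
    return -u ** 1.5 / (3.0 * MU * G_FORCE)

def A(x, t):
    """Monochromatic phase function A = S' - E t."""
    return S_prime(x) - E * t

def hj_residual(x, t, d=1e-5):
    """Left side of the one-dimensional form of Eq. (11.4), by central differences."""
    dA_dx = (A(x + d, t) - A(x - d, t)) / (2.0 * d)
    dA_dt = (A(x, t + d) - A(x, t - d)) / (2.0 * d)
    return dA_dx ** 2 / (2.0 * MU) + V(x) + dA_dt

# Sample points chosen inside the classically allowed region E > V.
residuals = [abs(hj_residual(0.2 * k, 0.3)) for k in range(-5, 15)]
max_residual = max(residuals)
```

The residual vanishes to within the accuracy of the finite differences,
and the x-derivative of S' reproduces the classical momentum at each
point, in line with Eq. (11·6).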

¹ Solutions of Eq. (11·5) are definite functions of x, y, z, whereas the
action integral

$$S = \int_A^B 2T\,dt$$

depends upon the path of integration and is not a definite function of the
coordinates of the terminal point B. If, however, we integrate along any
one of the lines of the vector grad S', it may be proved that

$$S'(B) - S'(A) = \int_A^B 2T\,dt = S.$$

On the other hand, as $\dfrac{2\pi}{h}(S' - Et)$ is, by Eq. (11·2), the
phase angle of the $\Psi$ waves, it is evident that the "level" surfaces of
S' are wave fronts. Hence it is to be expected that each such surface will
generate others in accordance with Huygens' principle. This surmise is
readily shown to be correct. The vector grad S' is by definition
orthogonal to the level surfaces of S' and, by Eq. (11·6), has the value
$\sqrt{2\mu(E - V)}$. Hence if we choose any smooth surface as a surface
of constant S', say $S' = \alpha$, we can construct a neighboring surface
$S' = \alpha + d\alpha$ by going out along each normal a distance
$d\alpha/|\operatorname{grad}S'|$ and connecting the resultant terminal
points (cf. Fig. 2); or we can make the equivalent construction by drawing
spheres of radius $d\alpha/|\operatorname{grad}S'|$ and forming their
envelope. This process can be continued, setting up an infinite number of
successive surfaces S' = constant and so integrating the differential
equation (11·5). Since the phase angle of our approximate wave function is
$\dfrac{2\pi}{h}(S' - Et)$, the vector wave number is given by¹

$$\vec{\sigma} = \frac{1}{h}\operatorname{grad}S', \qquad (11\cdot7)$$

and the phase velocity is $\nu/|\vec{\sigma}|$ or
$E/|\operatorname{grad}S'|$. Hence the radii of the spheres as defined
above are directly proportional to the phase velocities or wave lengths,
as is proper in the Huygens construction.

In case we do not throw away the terms in $h$ and $h^2$ of Eq. (11-3)
we may still obtain, under certain circumstances, an excellent approximate
solution by means of the substitution $\Psi = F e^{2\pi i A/h}$ if we give F
a constant value. Then all the terms in Eq. (11-3) cancel out or reduce
to zero except the one involving $\nabla^2 A$. If the gradient of the potential
energy V(x,y,z) is small, grad S' or grad A will be nearly constant in
magnitude. If the wave fronts are relatively flat, it will be nearly constant
in direction. Then $\nabla^2 A = \operatorname{div}\operatorname{grad} S'$ will be small and the
approximation good.²



¹ Cf. definition of wave length in footnote, p. 38.

² The quality of the approximation may be tested by determining the percentage
discrepancy between the actual value of $\nabla^2\Psi$ and the value required by the exact
differential equation. If the amplitude F is constant, this fractional error in $\nabla^2\Psi$
reduces to

$$\frac{h\,\nabla^2 A}{4\pi i\mu(E - V)}.$$

Let l, m, n denote the direction cosines of the wave normal and let T denote the
kinetic energy E − V. Let the direction of the wave normal at the point x, y, z be
that of the z axis. Then the above fraction becomes

$$\frac{1}{2\pi i}\left[\frac{\lambda}{2T}\frac{\partial T}{\partial z} + \lambda\left(\frac{\partial l}{\partial x} + \frac{\partial m}{\partial y}\right)\right].$$

Thus the approximation is good provided that the fractional change in kinetic energy





A still better approximation is obtained if the imaginary terms of
(11-3) are canceled out by the requirement that F satisfy the equation

$$F^2\nabla^2 A + 2F\operatorname{grad} F\cdot\operatorname{grad} A \equiv \operatorname{div}(F^2\operatorname{grad} A) = 0. \tag{11-8}$$

In one dimension this equation is solved by setting F equal to a multiple
of $|\operatorname{grad} A|^{-1/2}$. This approximate expression for the variation in the
amplitude is used in the B. W. K. method, described in Sec. 21. For our
present purpose it suffices to assume that F is constant.
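
The size of the neglected term can be gauged numerically. The following is a minimal sketch in Python, assuming motion in one dimension with constant amplitude F, for which the fractional error in $\nabla^2\Psi$ has the magnitude $\lambda\,|dT/dx|/(4\pi T)$, i.e., the fractional change of kinetic energy in one wave length divided by $4\pi$. The function names and the two sample cases are illustrative assumptions and are not taken from the text.

```python
import math

PLANCK_H = 6.62607015e-34        # Planck's constant, J s
ELECTRON_MASS = 9.1093837e-31    # kg
ELECTRON_VOLT = 1.602176634e-19  # J (numerically also the electron charge in C)


def de_broglie_wavelength(mass, kinetic_energy):
    """Local wave length h / sqrt(2 mu T) for kinetic energy T = E - V."""
    return PLANCK_H / math.sqrt(2.0 * mass * kinetic_energy)


def fractional_error_1d(mass, kinetic_energy, force):
    """Magnitude lambda * |dT/dx| / (4 pi T) of the neglected term.

    In one dimension dT/dx = -dV/dx, so |dT/dx| is the magnitude of the force.
    """
    wavelength = de_broglie_wavelength(mass, kinetic_energy)
    return wavelength * abs(force) / (4.0 * math.pi * kinetic_energy)


# Hypothetical case 1: a 1 eV electron in a field of 1e10 V/m.
electron_error = fractional_error_1d(
    ELECTRON_MASS, 1.0 * ELECTRON_VOLT, ELECTRON_VOLT * 1.0e10)

# Hypothetical case 2: a 1 microgram grain with 1e-9 J of kinetic energy,
# acted on by a force of 1e-8 N (roughly its own weight).
grain_error = fractional_error_1d(1.0e-9, 1.0e-9, 1.0e-8)

print("electron:", electron_error)
print("grain:   ", grain_error)
```

For the electron the error comes out of order unity, so that the approximation fails; for the grain it is smaller by more than twenty powers of ten, in keeping with the remark that the condition practically always obtains for heavy particles.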

*12. WAVE PACKETS AND THE MOTION OF PARTICLES IN A FORCE FIELD: FERMAT'S PRINCIPLE

In Chap. I we showed that the rays determined by the principle of
least time and the orbits of the classical mechanics become identical in
form if the local wave length in the wave problem is related to the
classical local momentum and the energy in the mechanical problem
by the formula

$$\lambda = \frac{h}{\sqrt{2\mu(E - V)}}.$$

In Secs. 9 and 10 we went on to demonstrate, with the aid of certain
plausible assumptions, that the motion in time of particles in a problem
in classical mechanics is identical with the motion of appropriately
defined wave packets in the corresponding wave problem, thus establishing
an asymptotic agreement between classical mechanics and wave
mechanics. The assumption that $E = h\nu$, combined with the above
expression for $\lambda$, serves to define the differential equation for the wave
problem (Schrödinger's equation). The remaining assumptions are as
follows:

a. The rays of geometrical optics defined in terms of wave normals can 
be determined by the principle of least time. 

b. Wave packets travel along the normals to corresponding extended 
waves. 

c. The velocity of the wave packets is given by the group-velocity
formula (4-1).

In the case of free particles the wave equation is that for a homogen- 
eous medium and the assumptions a and b are trivial while the work of 
Sec. 10 verifies assumption c. We are now ready to check these three 


in one wave length and the change in the direction of the wave normal in the same
distance are small compared with unity. In the case of a heavy particle to which the
classical mechanics should be applicable, this condition will practically always
obtain owing to the small wave lengths involved. Exceptions occur when the particle
is sensibly at rest and when the waves are diverging from, or converging toward, a
focus.





assumptions for the general case of waves correlated with the motion
of a single particle in a force field.

Let us begin with assumption a. We have to show that in the case
of an extended monochromatic progressive wave system whose wave
length is short compared with the scale of the inhomogeneity of the medium
and the radius of curvature of the wave front, lines drawn so as to be
everywhere perpendicular to the wave front satisfy the equation

$$\delta\int\frac{ds}{\lambda} = 0,$$

where ds denotes an element of path as usual. We may
identify the wave fronts with the level surfaces of
the function S' of the preceding article, and the direction
of the wave normal with the direction of grad S'.
Let the full line AFB connecting the wave fronts
$S' = S_A'$ and $S' = S_B'$ in Fig. 3 have the direction of grad S' at every
point. For this path of integration

$$S_B' - S_A' = \int_{AFB}|\operatorname{grad} S'|\,ds = h\int_{AFB}\frac{ds}{\lambda}. \tag{12-1}$$

Now consider any other path such as AGB. Let $\theta$ denote the angle
between the path element and the vector grad S'. Then

$$S_B' - S_A' = \int_{AGB}|\operatorname{grad} S'|\cos\theta\,ds = h\int_{AGB}\frac{\cos\theta}{\lambda}\,ds < h\int_{AGB}\frac{ds}{\lambda}. \tag{12-2}$$

Comparing (12-2) with (12-1), we see that $\int_A^B ds/\lambda$ has a minimum for
the path of integration AFB which is everywhere directed along the wave
normal. This justifies assumption a.
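
The minimum property can also be checked numerically. The following is a minimal sketch in Python, assuming units in which $\mu = 1$, a uniform field with $V = y$, and total energy $E = 1$ (all hypothetical choices). The classical orbit launched from the origin at 45° is then the parabola $y = x - x^2/2$, and the integral $\int|\operatorname{grad} S'|\,ds = h\int ds/\lambda$ taken along it is compared with the same integral along neighboring paths having the same end points.

```python
import math

ENERGY = 1.0   # total energy E (assumed units with mu = 1)
FIELD = 1.0    # strength g of the uniform field, V = g * y


def height(x, eps):
    """Classical parabola y = x - x^2/2 plus a bump vanishing at both ends."""
    return x - 0.5 * x * x + eps * math.sin(math.pi * x)


def slope(x, eps):
    """Derivative dy/dx of the path defined by height()."""
    return 1.0 - x + eps * math.pi * math.cos(math.pi * x)


def path_integral(eps, steps=2000):
    """Midpoint-rule value of the integral of sqrt(2 (E - V)) ds from x=0 to x=1."""
    dx = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        momentum = math.sqrt(2.0 * (ENERGY - FIELD * height(x, eps)))
        total += momentum * math.sqrt(1.0 + slope(x, eps) ** 2) * dx
    return total


classical = path_integral(0.0)
for eps in (-0.05, -0.02, 0.02, 0.05):
    print(eps, path_integral(eps) - classical)
```

Every displaced path gives a larger value than the classical orbit, for which the integral equals 4/3 in these units. The end point is taken short of the point where neighboring orbits of the same energy cross, since beyond such a point the orbit is stationary but no longer a minimum.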

In order to validate assumptions b and c we shall extend the discussion
of Sec. 10 to the case of motion in a force field by building up a
wave packet from approximate wave functions of the form

$$u = \exp\left[\frac{2\pi i}{h}(S' - Et)\right], \tag{12-3}$$

where S' is a suitably chosen solution of (11-5). We know from the
preceding section that there exists such a function S'(x,y,z) which takes
on any desired value, $S_0'$, on any desired smooth surface $\Sigma$. Furthermore
we can choose the positive sense of the gradient of S' at $\Sigma$ at
pleasure and can give |grad S'| any value we please at any given point
of space by proper choice of the energy E. Let P be an arbitrary point
with the coordinates $x_0$, $y_0$, $z_0$. It is evident that we can choose arbitrarily
$S'_P$, $(\operatorname{grad} S')_P$, and the form of a smooth level surface $\Sigma$ passing
through P and normal to grad S' at that point. If we agree that $\Sigma$ shall be





a plane and set $S'_P$ equal to zero, there remain the three components of
$(\operatorname{grad} S')_P$, say $h\xi$, $h\eta$, $h\zeta$, at our disposal before the function S' is finally
fixed. We use the three-parameter family of S' functions obtained
in this way to build up the "wave packet"

$$\Psi_1 = \iiint F(\xi,\eta,\zeta)\exp\left\{\frac{2\pi i}{h}\left[S'(\xi,\eta,\zeta;x,y,z) - Et\right]\right\}d\xi\,d\eta\,d\zeta. \tag{12-4}$$

The phase function of the integrand satisfies (11-5) and (11-4) if E
is given the value

$$E = V(x_0,y_0,z_0) + \frac{h^2}{2\mu}(\xi^2 + \eta^2 + \zeta^2). \tag{12-5}$$

For each set of values of $\xi$, $\eta$, $\zeta$, the integrand represents a progressive
wave which is plane in the neighborhood of P and has the wave number
components $\sigma_x = \xi$, $\sigma_y = \eta$, $\sigma_z = \zeta$ at that point [cf. Eq. (11-7)].

Let $F(\xi,\eta,\zeta)$ be a real and positive function of its arguments which
has a maximum at $\xi = \xi_0$, $\eta = \eta_0$, $\zeta = \zeta_0$, and vanishes outside a small
region adjacent to this point. Since S' vanishes at P, the constituent
waves which make up $\Psi_1$ are in perfect phase agreement at P, when t
is zero. At this space-time point, $|\Psi_1|$ takes on its maximum value.
From the argument of Sec. 9 we infer that $|\Psi_1|$ will be small at t = 0
and at later times except in a region K at whose center lies the point of
best phase agreement.

As $\Psi_1$ is composed of approximate solutions of the Schrödinger equation
(11-1), it must be an approximate solution itself. Consequently we
can identify the velocity of the point of best phase agreement at the
time t = 0 with the velocity of the center of the wave packet formed by
the corresponding exact solution of (11-1). (In view of the alternate
discussion of the motion of wave packets given in Sec. 13 it is hardly
worth while to attempt any high degree of rigor in the present argument.)
In order to locate the point of best phase agreement at a slightly later
time $\Delta t$, we form the derivatives of the phase function $S' - E\,\Delta t$ with
respect to $\xi$, $\eta$, $\zeta$ at the point $\xi_0$, $\eta_0$, $\zeta_0$ and set them equal to zero. In
calculating these derivatives we make use of the fact that S' has been so
constructed that it has the expansion

$$S' = h\xi(x - x_0) + h\eta(y - y_0) + h\zeta(z - z_0) + \text{terms in higher powers of } (x - x_0), \text{ etc.}$$

in the neighborhood of the point P. As we are concerned with the
motion of the point of best phase agreement in a short interval of time
we have to do with small displacements and can neglect the terms of
higher order in $(x - x_0)$, etc. Then



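$$h(x - x_0) - \left(\frac{\partial E}{\partial\xi}\right)_0\Delta t = 0;\qquad h(y - y_0) - \left(\frac{\partial E}{\partial\eta}\right)_0\Delta t = 0;\qquad h(z - z_0) - \left(\frac{\partial E}{\partial\zeta}\right)_0\Delta t = 0.$$

The components of the packet velocity are accordingly:

$$(v_g)_x = \frac{1}{h}\left(\frac{\partial E}{\partial\xi}\right)_0;\qquad (v_g)_y = \frac{1}{h}\left(\frac{\partial E}{\partial\eta}\right)_0;\qquad (v_g)_z = \frac{1}{h}\left(\frac{\partial E}{\partial\zeta}\right)_0. \tag{12-6}$$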


These equations are identical with Eqs. (10-8) for a free particle and show
that the packet moves along the normal to the "median" wave front
with a speed given by the standard group-velocity formula for homogeneous
media. This completes the justification of assumptions b and c.
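
The agreement between the packet velocity and the classical velocity can be illustrated with a short computation. The following is a minimal sketch in Python, assuming the quadratic dependence of E on $\xi$, $\eta$, $\zeta$ given in Eq. (12-5) and an electron with hypothetical wave-number components; the derivative $\partial E/\partial\xi$ is formed by a central difference and the resulting velocity is compared with $h\xi_0/\mu$, the classical momentum divided by the mass.

```python
PLANCK_H = 6.62607015e-34      # Planck's constant, J s
ELECTRON_MASS = 9.1093837e-31  # kg


def packet_energy(xi, eta, zeta, mass, potential_at_p=0.0):
    """E = V(P) + h^2 (xi^2 + eta^2 + zeta^2) / (2 mu), as in Eq. (12-5)."""
    return potential_at_p + PLANCK_H ** 2 * (xi * xi + eta * eta + zeta * zeta) / (2.0 * mass)


def group_velocity_x(xi0, eta0, zeta0, mass, step=1.0e3):
    """x component (1/h) dE/d(xi) of the packet velocity, by central difference."""
    upper = packet_energy(xi0 + step, eta0, zeta0, mass)
    lower = packet_energy(xi0 - step, eta0, zeta0, mass)
    return (upper - lower) / (2.0 * step * PLANCK_H)


# Hypothetical wave-number components in reciprocal meters.
XI0, ETA0, ZETA0 = 1.0e9, 2.0e8, 0.0

v_packet = group_velocity_x(XI0, ETA0, ZETA0, ELECTRON_MASS)
v_classical = PLANCK_H * XI0 / ELECTRON_MASS   # p_x / mu with p_x = h * xi0

print(v_packet, v_classical)
```

The constant V(P) drops out of the derivative, so the result is the same whatever value the potential has at the point P.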


13. DIRECT RIGOROUS PROOF OF NEWTON’S SECOND LAW OF MOTION 

FOR WAVE PACKETS 

The work of Secs. 3, 4, 9, 10, 11, and 12 shows that localized solutions
of the Schrödinger equation exist which follow the laws of ordinary
mechanics. The method of developing the subject which we have used,
although somewhat tedious, has the advantage of being closely related
to optical theory and to the historical development of wave mechanics.
It forms a natural background for the further elaboration of the quantum
theory. There is, however, a more exact and elegant method of deriving
the classical equations of motion for wave packets. This procedure is
due primarily to Ehrenfest.¹ The method is applicable to the case of a
system of particles, but for simplicity the present discussion is restricted
to the case of a single particle. In Sec. 39c, the more general problem
will be dealt with by a still different method.

Let $\Psi$ denote a quadratically integrable and normalized solution of
the wave equation

$$\nabla^2\Psi - \frac{8\pi^2\mu}{h^2}V\Psi + \frac{4\pi i\mu}{h}\frac{\partial\Psi}{\partial t} = 0. \tag{13-1}$$

By hypothesis the corresponding probability of finding the particle in
the element dxdydz is $\Psi\Psi^*\,dxdydz$. Let us multiply this probability
by the corresponding value of the coordinate x and sum over all elements
of volume. Let the resulting integral be assumed to converge. It will
clearly represent the statistical mean value of x for a large number of

¹ P. Ehrenfest, Zeits. f. Physik 45, 455 (1927); A. E. Ruark, J. Opt. Soc. Am. 16,
40 (1928), Phys. Rev. 31, 533 (1928); A. Sommerfeld, A.S.W.E., p. 2%.





observations made on the various systems of an assemblage described by
$\Psi(x, y, z, t)$. Calling this mean value $\bar{x}$, we have

$$\bar{x} = \iiint x\,\Psi\Psi^*\,dx\,dy\,dz. \tag{13-2}$$

Similar expressions will hold for the other coordinates. In the limiting
case of a very sharply defined wave packet, the expected departures
of the individual measurements of x from the mean will be negligible
so that $\bar{x}$ may be identified with the unique value assigned to x in the
classical mechanics. Hence the equations of motion for the various
mean values should become identical with the classical equations of
motion in the case of such a sharply defined packet.

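In particular, we should expect the classical or average momentum to
be $\mu\,d\bar{x}/dt$. Computing the value of this quantity we obtain

$$\mu\frac{d\bar{x}}{dt} = \mu\iiint x\left(\Psi^*\frac{\partial\Psi}{\partial t} + \Psi\frac{\partial\Psi^*}{\partial t}\right)dx\,dy\,dz. \tag{13-3}$$

The value of $\partial\Psi/\partial t$ is given by Eq. (13-1) and that of $\partial\Psi^*/\partial t$ by the
conjugate equation

$$\nabla^2\Psi^* - \frac{8\pi^2\mu}{h^2}V\Psi^* - \frac{4\pi i\mu}{h}\frac{\partial\Psi^*}{\partial t} = 0. \tag{13-4}$$

Multiplying Eqs. (13-1) and (13-4) by $\Psi^*$ and $-\Psi$, respectively, and
adding, we obtain

$$\frac{\partial}{\partial t}(\Psi\Psi^*) = \frac{h}{4\pi i\mu}\operatorname{div}(\Psi\operatorname{grad}\Psi^* - \Psi^*\operatorname{grad}\Psi). \tag{13-5}$$

Hence Eq. (13-3) yields

$$\begin{aligned}
\mu\frac{d\bar{x}}{dt} &= \frac{h}{4\pi i}\iiint x\operatorname{div}(\Psi\operatorname{grad}\Psi^* - \Psi^*\operatorname{grad}\Psi)\,dx\,dy\,dz \\
&= \frac{h}{4\pi i}\iiint\operatorname{div}[x(\Psi\operatorname{grad}\Psi^* - \Psi^*\operatorname{grad}\Psi)]\,dx\,dy\,dz \\
&\quad - \frac{h}{4\pi i}\iiint\left(\Psi\frac{\partial\Psi^*}{\partial x} - \Psi^*\frac{\partial\Psi}{\partial x}\right)dx\,dy\,dz.
\end{aligned} \tag{13-6}$$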


The first integral on the right may be converted into a surface integral
with the aid of Gauss's theorem and this integral will vanish if $\Psi$
approaches zero rapidly enough at infinity. We suppose this to be the
case and thus obtain the equation

$$\mu\frac{d\bar{x}}{dt} = \frac{h}{4\pi i}\iiint\left(\Psi^*\frac{\partial\Psi}{\partial x} - \Psi\frac{\partial\Psi^*}{\partial x}\right)dx\,dy\,dz. \tag{13-7}$$




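The same assumption regarding the behavior of $\Psi$ at infinity yields the
relation

$$\iiint\left(\Psi^*\frac{\partial\Psi}{\partial x} + \Psi\frac{\partial\Psi^*}{\partial x}\right)dx\,dy\,dz = \iiint\frac{\partial}{\partial x}(\Psi\Psi^*)\,dx\,dy\,dz = 0.$$

Hence Eq. (13-7) reduces to the form

$$\mu\frac{d\bar{x}}{dt} = \frac{h}{2\pi i}\iiint\Psi^*\frac{\partial\Psi}{\partial x}\,dx\,dy\,dz. \tag{13-8}$$

Differentiating again with respect to t and substituting the values of
$\partial\Psi/\partial t$ and $\partial\Psi^*/\partial t$ from the wave equations (13-1) and (13-4), we obtain

$$\begin{aligned}
\mu\frac{d^2\bar{x}}{dt^2} &= \frac{h}{2\pi i}\iiint\left(\frac{\partial\Psi^*}{\partial t}\frac{\partial\Psi}{\partial x} + \Psi^*\frac{\partial}{\partial x}\frac{\partial\Psi}{\partial t}\right)dx\,dy\,dz \\
&= -\frac{h^2}{8\pi^2\mu}\iiint\left[\frac{\partial\Psi}{\partial x}\nabla^2\Psi^* - \Psi^*\nabla^2\left(\frac{\partial\Psi}{\partial x}\right) + \frac{8\pi^2\mu}{h^2}\Psi\Psi^*\frac{\partial V}{\partial x}\right]dx\,dy\,dz.
\end{aligned}$$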


Using Green's theorem and discarding the resulting surface integral
as above, we obtain the relation

$$\mu\frac{d^2\bar{x}}{dt^2} = -\iiint\Psi\Psi^*\frac{\partial V}{\partial x}\,dx\,dy\,dz. \tag{13-9}$$

The right-hand member represents the mean value of the classical
force in the x direction for the packet. As previously explained, in the
limiting case of a sharply defined wave packet $\bar{x}(t)$ can be identified
with the unique classical value of x at the time t and under the same
circumstances we can identify $-\overline{\partial V/\partial x}$ with the unique classical value of
the force component in the direction x at the time t. Thus the classical
equations of motion are rigorously derived from the wave theory, subject
only to possible restrictions regarding the way in which $\Psi$ approaches
zero at points far from the center of the packet. These restrictions are
of little physical importance since it is certain that wave functions
exist to which Eq. (13-9) applies, and which therefore are capable of
representing the actual behavior of a large-scale particle. They are
covered by the more general restrictions imposed on physically admissible
wave functions in Sec. 32b.
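
The mean values used in this argument are easily evaluated for a concrete packet. The following is a minimal sketch in Python, assuming one dimension, units in which h = 1, and a Gaussian packet with carrier wave number $\sigma_0$ (all hypothetical choices). It evaluates $\bar{x} = \int x\,\Psi\Psi^*\,dx$ and the one-dimensional form of the momentum integral, $\frac{h}{2\pi i}\int\Psi^*\,\frac{\partial\Psi}{\partial x}\,dx$, by a midpoint sum; for this packet the second should come out equal to $h\sigma_0$.

```python
import cmath
import math

PLANCK_H = 1.0  # assumed unit system in which h = 1


def make_packet(x0, width, sigma0):
    """Normalized Gaussian packet centered at x0 with carrier wave number sigma0."""
    norm = (2.0 * math.pi * width ** 2) ** -0.25

    def psi(x):
        envelope = math.exp(-((x - x0) ** 2) / (4.0 * width ** 2))
        return norm * envelope * cmath.exp(2j * math.pi * sigma0 * x)

    return psi


def mean_position_and_momentum(psi, x_min, x_max, steps=4000):
    """Midpoint sums for the mean of x and for (h / 2 pi i) * integral psi* dpsi/dx."""
    dx = (x_max - x_min) / steps
    mean_x = 0.0
    mean_p = 0j
    for k in range(steps):
        x = x_min + (k + 0.5) * dx
        value = psi(x)
        # Central difference for d(psi)/dx at the midpoint of the cell.
        derivative = (psi(x + 0.5 * dx) - psi(x - 0.5 * dx)) / dx
        mean_x += x * abs(value) ** 2 * dx
        mean_p += PLANCK_H / (2j * math.pi) * value.conjugate() * derivative * dx
    return mean_x, mean_p


packet = make_packet(x0=1.5, width=1.0, sigma0=0.8)
mean_x, mean_p = mean_position_and_momentum(packet, -8.5, 11.5)
print(mean_x, mean_p)
```

The imaginary part of the momentum integral vanishes to within rounding error, as it must if the integral is to represent a measurable mean value.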


14. THE STATISTICAL INTERPRETATION OF THE WAVE THEORY OF 

MATTER 

14a. Review of Assumptions. — The mathematical developments of 
Secs. 9, 10, 11, and 12 will now be used to give a more precise form to the 





statistical interpretation of our theory. To this end we may reformulate
the physical assumptions made, either tacitly or in so many words, up
to this point.

a. Matter, like radiation, combines the properties of the waves and
corpuscles of classical physics. We may regard matter as corpuscular in
character provided that we replace the classical laws of motion by
statistical laws to be formulated in terms of a suitable "wave function"
having properties similar in many respects to those of the wave fields of
classical optics.

b. Whereas in the classical mechanics the state of a system is
defined by the values of the coordinates and momenta of all its constituent
particles, we here assume that the most exact description
possible for our knowledge of the state of a system is contained in the
specification of a corresponding function $\Psi$ of the coordinates and the
time which determines the relative probability of different positions or
configurations for each value of the parameter t. In the following discussion
we shall see that the function $\Psi$ correlated with a system is
not completely independent of the observer. Hence we shall say that all
systems so prepared that they can be correlated with the same function $\Psi$
are in a common subjective state. The objective or true state of an individual
system is operationally undefined.

In case the probability of very large values of the coordinates is
sufficiently small, the function $\Psi$ is quadratically integrable and may
be normalized in accordance with the condition

$$\int|\Psi|^2\,dx_1\,dy_1\cdots dz_n = 1.$$

In this case $|\Psi|^2\,dx_1\cdots dz_n$ is to be interpreted as the probability of the
group of mutually adjacent configurations associated with the volume
element $dx_1\,dy_1\cdots dz_n$ in configuration space.¹

c. The behavior of large-scale bodies as described by the Newtonian
mechanics is to be correlated with the motion of appropriate wave packets
or localized $\Psi$ functions which may be thought of as compounded from
elementary monochromatic and approximately plane² waves having a
small continuous range of frequencies and wave normals.

d. The energy of a mechanical system and the frequency of the associated
wave system are related according to the optical rule of Einstein,

$$E = h\nu. \tag{2-1}$$

¹ In view of later developments (cf. Secs. 19, and 36), it is best to give the phrase
"probability of a group of configurations" operational meaning by identifying it with
the probability that an accurate configuration measurement will yield a result which
belongs to the group in question.

² I.e., plane over the effective volume of the packet.




e. The vector local momentum $\vec{p}$ of a single particle and the vector
local wave number $\vec{\sigma}$ are related, in the limiting case where the classical
mechanics is valid, according to the de Broglie rule,

$$\vec{p} = h\vec{\sigma}. \tag{4-4}$$

This relation can be extended to the case of a system of n particles by a
suitable definition of $\vec{p}$ and $\vec{\sigma}$ as vectors in the 3n-dimensional configuration
space.

f. The wave functions describing the behavior of an isolated system of
particles moving under the influence of conservative forces are to be
solutions of the Schrödinger wave equation (7-3), p. 22.

14b. Necessity of Introducing Assemblages. — It is necessary at this
point to consider somewhat more narrowly the full import of these
hypotheses. Let us begin with postulates a and b. The statistical
laws and configuration probabilities referred to in these postulates
presuppose the possibility of a large number of experiments in which the
configuration and other properties of the system are observed. To
say that the probability of a configuration range A is $\alpha$ means nothing,
unless it is possible to make a large number of experimental determinations
of the configuration. If these experiments are possible, the statement
means, roughly speaking, that when their number is large the
fraction of the whole number of cases in which the configuration lies
in the range A is $\alpha$.¹ A single simple observation can never test a law of
probability and hence such laws can have meaning only in the description
of the results of repeated experiments. Furthermore, if the probability
of a configuration is a function of time, as we have supposed, the experiments
or observations made to test the probability at any one time
cannot be made on an individual mechanical system but require the use
of an assemblage of similar independent systems which can be observed
at corresponding times. In fact, even if the probability is independent
of time, the assemblage is necessary owing to the fact that an observation
involves an interaction between the system observed and the observing
mechanism which inevitably modifies the future behavior of the system
being studied. The configuration probabilities with which we normally
deal are those of isolated systems or systems moving under the influence
of known forces not including those used to make the observations.
Hence the first observation we make upon a system initially in some

¹ The concept of probability has no precise operational meaning and is at the
moment the subject of considerable controversy. Hence, from the operational
point of view on which this volume is based, it would be better to avoid the term
altogether. Such a procedure would involve some circumlocution, however, and it
seems best for our present purpose to treat probability as an unanalyzable concept
approximately defined in operational terms by the above statement.





definite subjective state will ordinarily throw it out of that state. Thus
we are driven to the use of Gibbsian assemblages for the verification of
the predictions of our wave functions.
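
The frequency meaning of such a probability statement can be illustrated by simulating an assemblage. The following is a minimal sketch in Python, assuming a single coordinate whose $|\Psi|^2$ is a Gaussian of unit width and taking for the range A the interval from −1 to 1, so that $\alpha = \operatorname{erf}(1/\sqrt{2})$, about 0.683. Pseudo-random sampling stands in for the measurements, each draw representing one observation on a fresh member of the assemblage; the names and numerical choices are illustrative only.

```python
import math
import random


def observed_fraction(n_systems, lower, upper, seed=0):
    """Fraction of n_systems independent systems found with lower <= x <= upper.

    Each system is observed once; its coordinate is drawn from a Gaussian
    |Psi|^2 of unit width (an assumed distribution).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_systems):
        x = rng.gauss(0.0, 1.0)
        if lower <= x <= upper:
            hits += 1
    return hits / n_systems


ALPHA = math.erf(1.0 / math.sqrt(2.0))  # predicted probability of the range A

for n in (1, 100, 20000):
    print(n, observed_fraction(n, -1.0, 1.0), ALPHA)
```

A single observation yields a fraction of either 0 or 1 and so tests nothing, whereas the fraction for the large assemblage settles near $\alpha$.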

When such an assemblage has been prepared, we can say that its
state is definite and objective. Its reaction to future experiments will
depend only on the nature of the experiments and not on the observers.
If its past history has been such as to yield a maximum of information
about the future, that history determines a wave function which predicts
the statistical results of any future experiments on the assemblage and
may be considered to define its state. In principle this wave function is
experimentally determinable up to a physically meaningless constant phase
factor (cf. Secs. 15f and 42b). By mixing two or more such assemblages
with different wave functions we get a more general type of assemblage
called a mixture. We shall have more to say about mixtures in Secs. 41b
and 53d. In the meantime we postulate that the most general possible
assemblage of independent identical systems is a mixture whose state is
defined when the wave functions and relative populations of its components
are given.

When a physical system is a member of a Gibbsian assemblage previously
prepared and having a definite wave function, it is natural to say
that it is in the state defined by that wave function. More generally
we can identify the state of a physical system with that of a Gibbsian
assemblage of identical systems so prepared that the past histories (up to
the epoch t = 0) of all its members are the same in all details that can
affect future behavior as that of the original system. Such a definition of
the state of an individual system is the best we can make but, as previously
suggested, it contains a subjective element. Knowledge of the history
of a system prior to a chosen epoch is necessarily subjective and can be
different for different observers of that system. If two such observers
A and B with different historical data try to repeat the preparation of
the system in order to form a Gibbsian assemblage of systems in the same
state, they will perform different operations and generate different types
of assemblage. If each is a good physicist, each may deduce from his
data a definite wave function describing the state of the system from his
point of view and correctly predicting the behavior of the assemblage of
similar systems which he has prepared. Now in exceptional cases where
a wave function states that the probability of a certain experimental
result is zero, the prediction can be disproved by a single experiment.
In general, however, an experiment performed on a single system can
neither prove nor disprove any wave function which we may attempt to
correlate with it. Hence there is no general objective way in which a
third person can prove that one of the two observers A and B is right
and the other wrong. Thus it seems best to use the term "subjective
state" in correlating a wave function with a specific system.





Whether we interpret the results derivable from the mathematical
machinery of quantum mechanics in terms of previously prepared
Gibbsian assemblages, or in terms of potential assemblages to be prepared
according to our record of the past history of an individual system,
we must interpret them in terms of some kind of assemblage. We are
thus led to conceive of quantum mechanics as primarily a variety of
statistical mechanics similar to the classical statistical mechanics of
Gibbs.¹

In the optical case it is proper to assume that the statistical behavior
of the concrete assemblage of photons actually passing through a given
piece of apparatus at a given time is the same as that of an ideal Gibbsian
assemblage of photons correlated with wave systems of similar structure,
but each moving in its own independent apparatus. This assumption
is justified by the experimental fact that the behavior of the elements
of the concrete assemblage is independent of the intensity of the light.
In other words, the different photons of a beam of radiation have apparently
complete mutual independence. Consequently simultaneous
observations on the positions of a large number of photons made by
allowing a beam of light to pass through dust-laden air, or to impinge on a
photographic plate, may be regarded as the equivalent of a series of
independent observations on the members of a corresponding Gibbsian
assemblage of individual photons. Although closely packed electrons,
protons, etc., are not independent, experience shows that in the case of
low-density beams of such particles the interactions may be negligible,
so that we can often replace independent observations on a Gibbsian
assemblage with a single wholesale statistical observation of a concrete
assemblage of approximately independent systems. Ideally, however,
the predictions of quantum mechanics should be tested by a series of
observations on a suitably prepared assemblage of completely independent
systems each in its own separate box or laboratory.

¹ Cf. J. C. Slater, J. Franklin Inst. 207, 449 (1929).

In calling attention to the parallelism between quantum mechanics and classical
statistical mechanics we must emphasize that the quantum-mechanical study of the
behavior of assemblages whose statistical behavior can be described by a single wave
function is not the quantum-mechanical way of treating the assemblages of the older
form of statistical mechanics. In the latter we have infinitely many sharply defined
classical states, whereas in the former we have a single quantum-mechanical state.
The quantum-mechanical generalization of classical statistical mechanics, usually
called quantum statistical mechanics (cf. Sec. 53), has to do with assemblages of systems
including many quantum-mechanical states and thus involving many independent wave
functions. We follow the notation of von Neumann (M.G.Q., p. 158; see list of
abbreviations, p. xviii) in calling such assemblages "mixtures" and will return to a brief
discussion of their properties in Sec. 41b. In this chapter the discussion refers
throughout to the simpler type of assemblage, sometimes called a "pure case," all
members of which belong to a common subjective state with a common single wave
function.





14c. Multiplicity of Energy and Momentum Values for a Definite
State. — Having settled this point, we turn our attention to postulate d
which may conceivably be interpreted in two ways. Since the frequency
$\nu$ does not have a unique value for any but the most special wave functions,
we must either interpret Eq. (2-1) as indicating an indeterminacy
in the energy of a particle associated with a wave packet, or else we must
regard the value of $\nu$ which appears in this equation as a mean value
for the wave function as a whole. We choose the former alternative
as the more natural and logical. It is more natural since we are endeavoring
to construct a mechanical theory which parallels the dualistic theory
of radiation, and since it is well known that the photons associated with a
given electromagnetic wave system may have a variety of energies.
It is more logical, for it can be proved a consequence of postulate b
combined with a reasonable experimental definition of kinetic energy.

To verify this last statement let us consider the case of a wave function
composed of the sum of two disturbances $\Psi_1$ and $\Psi_2$, each of which forms a
typical wave packet with a fairly well defined frequency and direction of
motion.¹ In the optical case such a disturbance can be obtained experimentally
by allowing a beam of plane-parallel monochromatic radiation
to fall upon an aperture covered by a shutter. Opening and closing the
shutter momentarily would form a primary packet which could be split
into two parts with different wave lengths and directions of motion
by suitable partial reflection at the surface of a moving half-silvered
mirror. If a disturbance of this kind is associated with a single particle,
whether electron or photon, hypothesis b requires that the particle have
a certain probability of moving with packet 1 and a complementary
probability of moving with packet 2. If an experimental observation
showed the particle to be in packet 1, we should have to assign to it one
energy and momentum, while if it were found in packet 2, we should have
to assign to it a different energy and momentum. Hence, in this case
the correlation of a single energy and single momentum with the wave
function leads to an absurd result; and, since no sharp dividing line
can be drawn between this case and that of a typical single wave packet,
we conclude that unique energy and momentum values should not be
assigned to any wave packet.

Indeed, we know by direct experiment in the optical case that such a 
single packet always contains a range of possible energies and momenta. 
If the packet be formed with the aid of an aperture and shutter as 
suggested above, diffraction will cause the radiation to diverge by an 
amount varying inversely with the dimensions of the aperture. If the 
intensity is large and the aperture is small compared with the wave 
length, photons will proceed in all directions from the slit, carrying 

¹ As the Schrödinger equation (7-3) is linear and homogeneous, the sum of any two
solutions is a solution.





with them momenta directed along the radius vector from the slit. At
the same time the interruption of the primary beam by the shutter will
destroy its monochromatic character and scatter the photons over a
narrow continuous spectrum. Diffraction experiments involving prolonged
photographic exposures and very low intensities show that the
distribution of energy over the pattern is independent of the intensity.
We infer that, if the intensity of the radiation in the packet is so low
that only one or two photons pass through the aperture during its formation,
the relative probability of each energy and direction of motion is
the same as if the intensity were very large.

The above discussion leads naturally to a reinterpretation or extension
of postulate e which relates the classical local momentum
$\sqrt{2\mu[E - V(x,y,z)]}$
with the local wave number $\sigma = 1/\lambda = \nu/w(x,y,z,\nu)$. These concepts
apply, strictly speaking, to classical particles and monochromatic waves
only. In the case of an assemblage of particles having a variety of
energies, the concept of local momentum becomes indeterminate, while
the local wave number becomes equally indefinite when applied to the
wave packet which represents such an assemblage.

To sharpen our ideas we shall substitute the idea of a measured
momentum for that of classical local momentum and the concept of
sinusoidal wave number and wave length for local wave number and wave
length. Thus we shall suppose that a wave system in one dimension
has a definite wave length if, and only if, it is strictly sinusoidal. Similarly
a wave system in three dimensions will be said to have a definite
vector wave number $\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$, only if it has the special sinusoidal
plane wave form

$$\Psi = F(t)\,e^{2\pi i(x\sigma_x + y\sigma_y + z\sigma_z)}.$$

As wave packets are in general representable by Fourier integrals, we 
must regard them as compounds of waves having infinitely many wave 
lengths and wave numbers just as they have infinitely many frequencies. 
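
The range of wave numbers contained in a packet can be exhibited by a direct Fourier analysis. The following is a minimal sketch in Python, assuming a Gaussian packet $\psi(x) = \exp(-x^2/4s^2)$ in one dimension, for which $|\psi|^2$ has the root-mean-square width s; the Fourier amplitude is evaluated by a plain midpoint sum and the root-mean-square spread of wave numbers is taken over the weight $|\phi(\sigma)|^2$. The function name and the grid sizes are illustrative choices.

```python
import cmath
import math


def wave_number_spread(width, n_x=300, n_sigma=200):
    """R.m.s. spread of wave numbers in a Gaussian packet.

    The packet is psi(x) = exp(-x^2 / (4 width^2)), so that |psi|^2 has the
    r.m.s. width `width`. Its Fourier amplitude
    phi(sigma) = integral of psi(x) exp(-2 pi i sigma x) dx
    is evaluated by a midpoint sum, and the spread is taken over |phi|^2.
    """
    x_max = 8.0 * width
    dx = 2.0 * x_max / n_x
    xs = [-x_max + (k + 0.5) * dx for k in range(n_x)]
    psi = [math.exp(-x * x / (4.0 * width * width)) for x in xs]

    sigma_max = 2.0 / (math.pi * width)  # about eight times the expected spread
    d_sigma = 2.0 * sigma_max / n_sigma
    weight_sum = 0.0
    second_moment = 0.0
    for j in range(n_sigma):
        sigma = -sigma_max + (j + 0.5) * d_sigma
        amplitude = dx * sum(p * cmath.exp(-2j * math.pi * sigma * x)
                             for p, x in zip(psi, xs))
        weight = abs(amplitude) ** 2
        weight_sum += weight
        second_moment += sigma * sigma * weight
    return math.sqrt(second_moment / weight_sum)


narrow = wave_number_spread(0.5)
broad = wave_number_spread(2.0)
print(narrow, broad)
```

The narrower packet contains the wider range of wave numbers, and for this Gaussian form the product of the two spreads equals $1/4\pi$ whatever the width.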

Furthermore we shall suppose that an assemblage of similarly prepared
particles has a single definite momentum, only if a measurement of
the momentum of each particle would give a single unique result in all
cases. As indicated above, such a measurement would lead in general
to a wide variety of results and so we must consider that the assemblage
consists of particles having a great many different momenta.

Now in the limiting case of an assemblage of heavy particles prepared
in such a manner as to minimize the uncertainties in position, energy,
and momentum, the values of momentum obtained by measurement
would converge upon the value of the classical local momentum
corresponding to the mean position and energy of the particles in the
assemblage. Similarly the wave lengths and wave numbers obtained by
Fourier analysis of the wave packet which represents such an assemblage
will converge on the local wave length for the average frequency of the
packet at the center of the packet. Hence we infer that in this limiting
case we can use the equation

$$\vec p = h\vec\sigma$$

to correlate values of the experimental vector momentum and the vector
wave number obtained by Fourier analysis. In the next section it will
be proved that if we make a suitable experimental definition of measured
momentum, we can use the equation in the same way quite generally.

15. THE WAVE FUNCTION AND MEASUREMENTS OF LINEAR MOMENTUM

15a. Operational Definition of Momentum for Free Particles. In
the last section we introduced the postulate that our knowledge of the
state of a system cannot be more exactly specified than by a function
$\Psi$ which determines the probability of various configurations as a
function of time and which describes the statistical behavior of an
assemblage of identical independent systems all of which are in the same
subjective state. We saw further that, just as the systems of such an
assemblage have a multiplicity of possible positions, so we must think of
them as having a similar multiplicity of possible momenta.

The question now arises: How can we determine from the wave
function of an assemblage of atoms or particles what the probabilities
of various possible momenta are? The answer to this question must,
of course, agree with experiment and must consequently have its basis
in an experimental definition of momentum. The first step toward the
answer to the above question is therefore to agree on an experimental
procedure for the measurement of linear momentum. Whatever
procedure we accept becomes the "operational"¹ definition of momentum
for quantum mechanics. It must harmonize with the classical definition
in the limiting case of a sharply defined wave packet and must be so
formulated that the measurement is feasible even when the classical
mechanics is inapplicable.

¹ Cf. P. W. Bridgman, The Logic of Modern Physics, New York, 1927.

A considerable variety of classical methods is available for the
measurement of momentum. Of these perhaps the most fundamental
are the magnetic deflection method involving a separate measurement of
the charge on the particle, and the elementary method of measuring the
velocity directly and multiplying it by the separately measured mass.
The fact that these measurements are not complete in one operation is
an objectionable feature apparently shared in one way or another by
every other method. As the mass and charge are parameters which
go into the construction of the Schrödinger equation, the necessity of
measuring them separately is perhaps less disturbing than otherwise.
We here adopt the direct velocity type measurement as the most
fundamental procedure and the one best adapted to our purpose. Thus
our starting point is the formula

$$\vec p = \mu\vec v. \qquad (15\cdot1)$$


In order to measure the velocity $\vec v$ directly it is necessary to make
two observations of the positional vector $\vec r$ and apply the classical
formula for the average velocity, viz.,

$$\vec v = \frac{\vec r_2 - \vec r_1}{t_2 - t_1}. \qquad (15\cdot2)$$


If the particle under consideration is moving in a field-free space,
we can identify the average velocity with the instantaneous velocity and
use (15·1) and (15·2) to reduce the problem of measuring momentum
to that of making two successive measurements of position. We encounter
an immediate difficulty, however, in the fact that every measurement
of position involves an unpredictable change in momentum. A
determination of position is usually made by means of radiation with the
aid of a microscope, telescope, or slit system. It depends upon a collision
phenomenon, viz., the scattering or reflection of photons by the particle
or system under observation. The study of the recoil electrons produced
in the scattering of X-rays by light atoms (Compton scattering) shows
that in such a collision the scattering system undergoes a change of
momentum of the order of $h/\lambda$ for each photon scattered. This change
in the momentum of the system, whose position is being observed,
can be made very small by using low-intensity, long wave length
radiation, but unfortunately the diffraction of the radiation causes an
uncertainty in the positional measurement which cannot be reduced below a
value of the order of magnitude of the wave length $\lambda$. Hence the
momentum perturbation is directly proportional to the precision of the
observation of position and cannot be neglected in the exact location of
an atomic system. Other methods of measuring position can be shown to
involve the same difficulty (cf. Sec. 16).
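The trade-off just described can be put in numbers. The following is a minimal sketch, assuming order-of-magnitude estimates only: the positional uncertainty is set equal to the wave length and the recoil to $h/\lambda$. The function name and the sample wave lengths are illustrative choices, not taken from the text.

```python
H = 6.62607015e-34  # Planck's constant in J s (SI value)

def position_measurement_tradeoff(wavelength_m):
    """Return (position uncertainty, momentum kick) for one scattered photon.

    Order-of-magnitude estimates only: resolution ~ wavelength,
    recoil ~ h / wavelength.
    """
    delta_x = wavelength_m
    delta_p = H / wavelength_m
    return delta_x, delta_p

# Visible light, X-rays, gamma rays (illustrative wave lengths in metres).
for lam in (5e-7, 1e-10, 1e-12):
    dx, dp = position_measurement_tradeoff(lam)
    print(f"lambda={lam:.1e} m  dx~{dx:.1e} m  dp~{dp:.2e} kg m/s  dx*dp~{dx * dp:.2e} J s")
```

Shortening the wave length sharpens the position and enlarges the kick in the same ratio, so in this estimate the product stays of the order of $h$.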

It will be observed that the change in momentum involved in the
first measurement of position is the one that makes the difficulty. We
take the object of the measurement to be the determination of the
momentum immediately prior to the time $t_1$ when the measurement is
begun. As Eq. (15·2) gives the average velocity in the time interval
$t_1 < t < t_2$, momentum perturbations at the time $t_2$ do no harm.
On the other hand, the result of the initial perturbation is that the
momentum we actually measure differs from that which we wish to
measure by an amount $\Delta p$ whose absolute value is of the order of
$h/\Delta r$, where $\Delta r$ is the absolute value of the uncertainty in
$\vec r_1$. Fortunately, if the particle is in an unlimited field-free
space, as we shall assume for the present, we can make the initial
observation of position a very rough one and still get accurate values of
$\vec v$ by allowing the time interval $t_2 - t_1$ to be very large. In
fact the accuracy of the momentum observations can be pushed beyond any
assignable limit if we start with an assemblage of particles whose
initial position has already been roughly measured and make $t_2 - t_1$
large enough. Actually the assumption that the position of the particles
of our assemblage has been roughly measured prior to the measurement of
momentum is almost no restriction at all, for our whole discussion has to
do with observations on an assemblage of particles whose state is
described by a $\Psi$ function which can be normalized so that we can
interpret $\Psi\Psi^* d\tau$ as the probability of the positions in the
volume element $d\tau$. It is but a slight step farther to require that
our wave function shall approach zero rapidly enough at infinity so that
the integrals which give the average value of each of the coordinates
x, y, z and the probable errors $[\overline{(x-\bar x)^2}]^{1/2}$,
$[\overline{(y-\bar y)^2}]^{1/2}$, $[\overline{(z-\bar z)^2}]^{1/2}$ shall
exist. If the wave function is of this type, it determines the initial
position of the particle to a certain degree of approximation and we can
infer that the initial position is determined to the same degree of
approximation by the experimental procedure for preparing the particles
which make up the assemblage. Obviously an assemblage of identical
particles in a common subjective state having the wave function $\Psi$
can be prepared only if there is an experimental procedure for assuring
that no particle can get into the assemblage which does not have the
right wave function. This procedure would accordingly constitute a rough
measurement of position made prior to the measurement of momentum.

It will now be evident that in the case of an assemblage of identical
particles prepared so as to have a roughly known initial position and
moving in an unlimited field-free space (the apparatus used for their
preparation having been withdrawn) Eqs. (15·1) and (15·2) afford the
basis for momentum measurements which can be carried through, in
principle, to any desired degree of exactitude. We now address ourselves
to the task of computing from the initial wave function the probabilities
of different possible linear momenta for an assemblage of this type,
reserving for later discussion the case in which the particles are moving
in a force field.
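The claim that a rough initial position and a long flight time suffice can be illustrated with a crude error budget. This is a sketch under stated assumptions, not a formula from the text: the total error is modelled as the sum of the initial recoil $h/\Delta r$ and the geometric term $\mu\,\Delta r/t$ arising from the uncertain starting point. The function names are hypothetical.

```python
import math

H = 6.62607015e-34  # Planck's constant in J s

def momentum_error(mass, delta_r, flight_time):
    """Assumed error model: initial recoil plus geometric uncertainty."""
    kick = H / delta_r                        # perturbation from locating the particle
    geometric = mass * delta_r / flight_time  # mu * (position error) / t
    return kick + geometric

def best_delta_r(mass, flight_time):
    """Initial position uncertainty that minimizes the assumed error model."""
    return math.sqrt(H * flight_time / mass)

ELECTRON_MASS = 9.1093837015e-31  # kg
for t in (1e-9, 1e-6, 1e-3):
    dr = best_delta_r(ELECTRON_MASS, t)
    print(f"t={t:.0e} s  best dr={dr:.2e} m  error={momentum_error(ELECTRON_MASS, dr, t):.2e} kg m/s")
```

In this model the best attainable error is $2\sqrt{h\mu/t}$, which falls without limit as the flight time grows, while the optimal initial localization becomes rougher.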

15b. Computation of Momentum Probabilities from a Wave Function.
Taking the origin at or near the most probable initial position, we can
replace Eqs. (15·1) and (15·2) by the relation

$$\vec p = \frac{\mu\vec r}{t}, \qquad (15\cdot3)$$

in which $\vec r$ is the measured position at some time t great enough so
that r is large compared with the uncertainty in its measurement. Then to
a certain degree of approximation we may identify the probability of a
momentum vector $\vec p$ whose terminal point lies within the element of
volume $dp_x\,dp_y\,dp_z$ of momentum space¹ with the probability of
locating the particle itself in the corresponding element of ordinary
space, viz., the element $dx\,dy\,dz = (t/\mu)^3\,dp_x\,dp_y\,dp_z$, for
which $\vec r = t\vec p/\mu$. As t is made larger and larger, the precision
of the measurement improves, so that we can rigorously identify the limit
of this probability with the probability of the corresponding measured
momentum.
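The limiting process can be checked on a case with a known closed form. The sketch below assumes a free one-dimensional Gaussian packet whose initial positional standard deviation is `s0`; the standard spreading law for such a packet is used, and all names are illustrative. It compares the spread of the inferred momentum $\mu x/t$ with the true momentum spread.

```python
import math

H = 6.62607015e-34          # Planck's constant in J s
HBAR = H / (2.0 * math.pi)  # h / 2 pi

def tof_momentum_spread(mass, s0, t):
    """Standard deviation of mu*x/t for a free Gaussian packet at time t."""
    s_t = s0 * math.sqrt(1.0 + (HBAR * t / (2.0 * mass * s0 ** 2)) ** 2)
    return mass * s_t / t

def true_momentum_spread(s0):
    """Momentum standard deviation of the same (minimum-uncertainty) packet."""
    return HBAR / (2.0 * s0)

ELECTRON_MASS = 9.1093837015e-31  # kg
S0 = 1e-9                         # assumed initial width in metres
for t in (1e-14, 1e-12, 1e-9):
    ratio = tof_momentum_spread(ELECTRON_MASS, S0, t) / true_momentum_spread(S0)
    print(f"t={t:.0e} s  inferred/true spread = {ratio:.6f}")
```

For short flight times the initial positional uncertainty inflates the inferred spread; for long ones the ratio tends to unity, which is the identification made in the text.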

As a basis for the computation of the probability we use the familiar
principle of physical optics, that a continuous distribution of elementary
wavelets (Huygens' wavelets, or plane waves) destroys itself by mutual
interference except at points where there is complete or partial phase
agreement among its elements. This principle of phase agreement was
used in Sec. 12 to demonstrate the laws of motion of wave packets.

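In order to apply the principle to the problem in hand, we may suppose
the wave packet analyzed into a continuous plane-wave spectrum such
as that given by Eq. (10·3). Then, at any space-time point x, y, z, t,
we shall have phase agreement for the elementary range of wave numbers
$d\sigma_x\,d\sigma_y\,d\sigma_z$, provided that the differential of the
phase function $\varphi$ [cf. Eq. (10·5)] vanishes for the range in
question. In other words, we must have

$$d\varphi = 2\pi\left[\left(x - t\frac{\partial\nu}{\partial\sigma_x}\right)d\sigma_x + \left(y - t\frac{\partial\nu}{\partial\sigma_y}\right)d\sigma_y + \left(z - t\frac{\partial\nu}{\partial\sigma_z}\right)d\sigma_z\right] = 0.$$

Thus it follows from the principle of phase agreement that the value of
the wave function at the space-time point x, y, z, t depends almost
exclusively on the amplitudes of the plane-wave components in the
neighborhood of the one for which

$$x = t\frac{\partial\nu}{\partial\sigma_x}, \qquad y = t\frac{\partial\nu}{\partial\sigma_y}, \qquad z = t\frac{\partial\nu}{\partial\sigma_z}. \qquad (15\cdot4)$$

In the case of free particles, $\nu$ is given by Eq. (10·2). Introducing
this value into Eqs. (15·4) and solving for $h\sigma_x$, $h\sigma_y$, and
$h\sigma_z$, we obtain

$$h\sigma_x = \frac{\mu x}{t}, \qquad h\sigma_y = \frac{\mu y}{t}, \qquad h\sigma_z = \frac{\mu z}{t}. \qquad (15\cdot5)$$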


¹ A space in which the momentum vectors are laid off as displacement vectors
from the origin of a set of Cartesian coordinate axes.





It will be proved that in the limit of very large values of t the wave
function depends solely on the amplitude of the plane-wave component
for which the above equations are satisfied. The right-hand member
of each of these equations denotes one of the momentum components
to be assigned to the point x, y, z, t according to the definition of
Eq. (15·3), while the left-hand member is the value of the same momentum
component given by the basic postulate of Eq. (4·4) (p. 14). We conclude
that the operational definition of momentum is in accord with Eq. (4·4)
and permits us to interpret this equation as a relation between measured
momenta and the wave numbers of Fourier analysis.

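Let us now proceed to the rigorous justification of the statements in
the preceding paragraph and to the exact evaluation of the probability
of the range of momentum values $dp_x\,dp_y\,dp_z$. The first step is to
rewrite Eq. (10·3) in the form

$$\Psi(x,y,z,t) = \iiint_{-\infty}^{\infty} G(\sigma_x,\sigma_y,\sigma_z)\, e^{2\pi i\left[x\sigma_x + y\sigma_y + z\sigma_z - \frac{ht}{2\mu}(\sigma_x^2 + \sigma_y^2 + \sigma_z^2)\right]}\, d\sigma_x\, d\sigma_y\, d\sigma_z. \qquad (15\cdot6)$$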


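By an appropriate change of variables and integration by parts this
reduces to (cf. Appendix B)

$$\Psi(x,y,z,t) = e^{-3\pi i/4}\left(\frac{\mu}{ht}\right)^{3/2} e^{\frac{\pi i \mu}{ht}(x^2+y^2+z^2)}\; G\!\left(\frac{\mu x}{ht}, \frac{\mu y}{ht}, \frac{\mu z}{ht}\right) + \text{terms negligible in the limit of large } t. \qquad (15\cdot7)$$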


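This proves that for very large values of t the wave function depends
solely on the value of the amplitude function G for that point
$\sigma_x, \sigma_y, \sigma_z$ which satisfies Eqs. (15·5). To get the
probability dW of a momentum in the element $dp_x\,dp_y\,dp_z$ of
momentum space we have to evaluate $\Psi\Psi^*$ for the point

$$x = \frac{t}{\mu}p_x, \qquad y = \frac{t}{\mu}p_y, \qquad z = \frac{t}{\mu}p_z$$

[cf. Eq. (15·3)], multiply it by the magnitude of the corresponding volume
element $(t/\mu)^3\,dp_x\,dp_y\,dp_z$, and then pass to the limit of
infinite values of t.¹ We obtain

$$dW = \frac{1}{h^3}\, G\!\left(\frac{p_x}{h},\frac{p_y}{h},\frac{p_z}{h}\right) G^*\!\left(\frac{p_x}{h},\frac{p_y}{h},\frac{p_z}{h}\right) dp_x\, dp_y\, dp_z.$$

It is convenient at this point to change the variables of integration in
Eq. (15·6) from $\sigma_x, \sigma_y, \sigma_z$ to $p_x, p_y, p_z$ in
accordance with the universal relation $\vec p = h\vec\sigma$. The result is

$$\Psi(x,y,z,t) = \frac{1}{h^3}\iiint_{-\infty}^{\infty} G\!\left(\frac{p_x}{h},\frac{p_y}{h},\frac{p_z}{h}\right) e^{\frac{2\pi i}{h}\left[x p_x + y p_y + z p_z - \frac{p_x^2+p_y^2+p_z^2}{2\mu}t\right]}\, dp_x\, dp_y\, dp_z. \qquad (15\cdot8)$$

¹ The wave function $\Psi$ is assumed to be normalized.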

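The introduction of the function

$$\Phi(p_x,p_y,p_z,t) = h^{-3/2}\, G\!\left(\frac{p_x}{h},\frac{p_y}{h},\frac{p_z}{h}\right) e^{-\frac{2\pi i}{h}\frac{p_x^2+p_y^2+p_z^2}{2\mu}t}$$

throws Eq. (15·8) into the form

$$\Psi(x,y,z,t) = h^{-3/2}\iiint_{-\infty}^{\infty} \Phi(p_x,p_y,p_z,t)\, e^{\frac{2\pi i}{h}(x p_x + y p_y + z p_z)}\, dp_x\, dp_y\, dp_z. \qquad (15\cdot9)$$

The Fourier integral theorem then yields the symmetrical reciprocal
formula

$$\Phi(p_x,p_y,p_z,t) = h^{-3/2}\iiint_{-\infty}^{\infty} \Psi(x,y,z,t)\, e^{-\frac{2\pi i}{h}(x p_x + y p_y + z p_z)}\, dx\, dy\, dz \qquad (15\cdot10)$$

[cf. Eq. (9·5)], and the expression for the probability of the momentum
range $dp_x\,dp_y\,dp_z$ reduces to

$$dW = \Phi\Phi^*\, dp_x\, dp_y\, dp_z. \qquad (15\cdot11)$$

In view of Plancherel's theorem we may deduce from Eqs. (15·9) and
(15·10) the additional relation

$$\iiint_{-\infty}^{\infty} \Phi\Phi^*\, dp_x\, dp_y\, dp_z = \iiint_{-\infty}^{\infty} \Psi\Psi^*\, dx\, dy\, dz = 1, \qquad (15\cdot12)$$

which is to be interpreted as a statement that the sum of the probabilities
of all possible momenta is unity, as it should be.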

Since $\Phi$ plays the same role in determining momentum as $\Psi$ plays in
determining position, it is convenient to introduce Jordan's nomenclature
and designate $\Phi$ and $\Psi$ as the probability amplitudes for momentum
and position, respectively. The appropriateness of this nomenclature is
evident from Eqs. (8·1) and (15·11).

The above Eqs. (15·10), (15·11), and (15·12) complete the development
of the mathematical method for evaluating the relative probabilities
of different momenta in the case of a free particle.
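The pair of relations (15·10) and (15·12) can be tried numerically in one dimension. The following is a rough sketch, assuming units in which $h = 1$, a Gaussian position amplitude of width `S` and mean momentum `P0`, and a plain Riemann sum in place of the integral. All of these choices are illustrative and not part of the text.

```python
import cmath
import math

H = 1.0            # units with Planck's constant h = 1 (assumption for convenience)
S, P0 = 1.0, 0.5   # hypothetical packet width and mean momentum

def psi(x):
    """Normalized Gaussian position amplitude with mean momentum P0."""
    norm = (2.0 * math.pi * S * S) ** -0.25
    return norm * math.exp(-x * x / (4.0 * S * S)) * cmath.exp(2j * math.pi * P0 * x / H)

DX = 0.05
XS = [-10.0 + i * DX for i in range(401)]
PSI = [psi(x) for x in XS]

def phi(p):
    """One-dimensional analogue of Eq. (15.10), evaluated as a Riemann sum."""
    total = sum(a * cmath.exp(-2j * math.pi * x * p / H) for a, x in zip(PSI, XS))
    return total * DX / math.sqrt(H)

DP = 0.01
PS = [-0.5 + i * DP for i in range(201)]   # momentum grid centred on P0

norm_x = sum(abs(a) ** 2 for a in PSI) * DX        # integral of Psi Psi* dx
norm_p = sum(abs(phi(p)) ** 2 for p in PS) * DP    # integral of Phi Phi* dp
print(norm_x, norm_p)
```

Both sums should come out very close to unity, and the momentum density $\Phi\Phi^*$ is peaked at `P0`, as the one-dimensional form of Eq. (15·11) would lead one to expect.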

*15c. Momentum of Center of Gravity. The method is readily
extended to the problem of the momentum associated with motion of the
center of gravity of a system of n particles moving under the influence of
a potential-energy function which depends only on the relative
coordinates of the system. In this case we define the momentum as the
product of the total mass M and the vector velocity of the center of
gravity. We assume as before that the initial position is approximately
determined by the method of starting the system off. The velocity is then
determined by a single observation of position made a long while after
the starting time. In the limit there is a one-to-one correspondence
between the positional coordinates observed and the momentum values to be
assigned, so that the observation of position and of momentum are
equivalent.





In order to compute the expectation of any given range of momentum
values it is necessary to make a change of independent variables so as
to express the wave function in terms of the coordinates of the center of
gravity and a suitable set of relative coordinates for the particles which
make up the system. Let the absolute coordinates of the n particles be
$x_1, y_1, z_1, x_2, \ldots, z_n$ and let X, Y, Z be the coordinates of the
center of gravity so that

$$MX = \sum_{k=1}^{n}\mu_k x_k, \qquad MY = \sum_{k=1}^{n}\mu_k y_k, \qquad MZ = \sum_{k=1}^{n}\mu_k z_k. \qquad (15\cdot13)$$

Let the relative coordinates be

$$\xi_k = x_k - x_1, \qquad \eta_k = y_k - y_1, \qquad \zeta_k = z_k - z_1, \qquad k = 2, 3, \ldots, n. \qquad (15\cdot14)$$

Substitution of the new variables in the wave function
$\Psi_0(x_1, \ldots, z_n, t)$ transforms it into a function of the new
variables which we shall call $\Psi(X,Y,Z,\xi_2,\eta_2,\ldots,\zeta_n,t)$.
Thus

$$\Psi_0(x_1, \ldots, z_n, t) = \Psi(X,Y,Z,\xi_2,\eta_2,\ldots,\zeta_n,t).$$

The Jacobian, or functional determinant, of the transformation is
unity, so that the volume element $dx_1 \cdots dz_n$ goes over into the
volume element $dX\,dY \cdots d\zeta_n$. Since
$\Psi_0\Psi_0^*\,dx_1 \cdots dz_n$ is the probability of a configuration
in the element $dx_1 \cdots dz_n$, we may identify

$$\Psi\Psi^*\, dX\, dY \cdots d\zeta_n$$

with the probability of a configuration in the element $dX \cdots d\zeta_n$.
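The statement that the Jacobian is unity is easy to verify numerically for one Cartesian axis; the full determinant is the product of three identical factors. The sketch below builds the matrix of Eqs. (15·13) and (15·14) for arbitrary masses and evaluates its determinant by Gaussian elimination. The helper names and the sample masses are illustrative assumptions.

```python
def transformation_matrix(masses):
    """Matrix d(X, xi_2..xi_n) / d(x_1..x_n) for one Cartesian axis."""
    n = len(masses)
    total = sum(masses)
    rows = [[m / total for m in masses]]   # X = sum(mu_k x_k) / M
    for k in range(1, n):                  # xi_k = x_k - x_1
        row = [0.0] * n
        row[0] = -1.0
        row[k] = 1.0
        rows.append(row)
    return rows

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-300:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                     # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

print(determinant(transformation_matrix([1.0, 2.0, 3.5, 0.25])))
```

The result does not depend on the masses chosen: adding all the other columns to the first reduces the matrix to a triangular one with unit diagonal.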

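The wave equation expressed in terms of the new variables is obtained
by direct transformation and has the form¹

$$\frac{1}{M}\left(\frac{\partial^2\Psi}{\partial X^2} + \frac{\partial^2\Psi}{\partial Y^2} + \frac{\partial^2\Psi}{\partial Z^2}\right) + \frac{1}{\mu_1}\sum_{j=2}^{n}\sum_{k=2}^{n}\left(\frac{\partial^2\Psi}{\partial\xi_j\,\partial\xi_k} + \frac{\partial^2\Psi}{\partial\eta_j\,\partial\eta_k} + \frac{\partial^2\Psi}{\partial\zeta_j\,\partial\zeta_k}\right) + \sum_{k=2}^{n}\frac{1}{\mu_k}\left(\frac{\partial^2\Psi}{\partial\xi_k^2} + \frac{\partial^2\Psi}{\partial\eta_k^2} + \frac{\partial^2\Psi}{\partial\zeta_k^2}\right) - \frac{8\pi^2}{h^2}V\Psi = -\frac{4\pi i}{h}\frac{\partial\Psi}{\partial t}. \qquad (15\cdot15)$$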


¹ The cross-derivatives can be eliminated if we use as relative coordinates the
quantities

$$\xi_k = x_k - \sum_{j=1}^{k-1}\mu_j x_j \Big/ \sum_{j=1}^{k-1}\mu_j, \qquad \eta_k = \cdots.$$

This elimination, however, somewhat complicates the expression for the potential
energy.





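Special solutions of this equation may be found by the device of separating
the variables. One assumes that $\Psi$ has the form

$$\Psi = \Psi_c(X,Y,Z,t)\,\Psi_r(\xi_2,\eta_2,\ldots,\zeta_n,t). \qquad (15\cdot16)$$

Insertion of this expression into Eq. (15·15) shows that it gives a solution
provided that $\Psi_c$ and $\Psi_r$ satisfy

$$\frac{1}{M}\left(\frac{\partial^2\Psi_c}{\partial X^2} + \frac{\partial^2\Psi_c}{\partial Y^2} + \frac{\partial^2\Psi_c}{\partial Z^2}\right) = -\frac{4\pi i}{h}\frac{\partial\Psi_c}{\partial t} \qquad (15\cdot17)$$

and

$$\sum_{k=2}^{n}\frac{1}{\mu_k}\left(\frac{\partial^2\Psi_r}{\partial\xi_k^2} + \frac{\partial^2\Psi_r}{\partial\eta_k^2} + \frac{\partial^2\Psi_r}{\partial\zeta_k^2}\right) + \frac{1}{\mu_1}\sum_{j=2}^{n}\sum_{k=2}^{n}\left(\frac{\partial^2\Psi_r}{\partial\xi_j\,\partial\xi_k} + \frac{\partial^2\Psi_r}{\partial\eta_j\,\partial\eta_k} + \frac{\partial^2\Psi_r}{\partial\zeta_j\,\partial\zeta_k}\right) - \frac{8\pi^2}{h^2}V\Psi_r = -\frac{4\pi i}{h}\frac{\partial\Psi_r}{\partial t}, \qquad (15\cdot18)$$

respectively. If the wave function factors in this manner into the
product of a function of the coordinates of the center of gravity and a
function of the relative coordinates, the probability of any set of values
for X, Y, Z is independent of the values of the other variables.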

More general solutions of Eq. (15·15) can be obtained by compounding
different special solutions of Eqs. (15·17) and (15·18). This fact suggests
the possibility of expressing any quadratically integrable solution of
Eq. (15·15) in the form

$$\Psi = h^{-3/2}\iiint_{-\infty}^{\infty} Q(P_x,P_y,P_z,\xi_2,\ldots,\zeta_n,t)\, e^{\frac{2\pi i}{h}(XP_x + YP_y + ZP_z - E_c t)}\, dP_x\, dP_y\, dP_z, \qquad (15\cdot19)$$

in which Q is a solution of Eq. (15·18) involving $P_x, P_y, P_z$ as
parameters, while $E_c$ is defined by

$$E_c = \frac{P_x^2 + P_y^2 + P_z^2}{2M}, \qquad (15\cdot20)$$

so as to make the exponential factor in the integrand of Eq. (15·19) a
plane-wave solution of Eq. (15·17). The conjecture is readily verified,¹
and hence the expression (15·19) can be used for the evaluation of the
probabilities of different possible momentum ranges.

¹ Given an arbitrary quadratically integrable solution of Eq. (15·15), we may
always apply the Fourier integral theorem to the variables X, Y, Z and so express
the function in the form

$$\Psi = h^{-3/2}\iiint_{-\infty}^{\infty} F(P_x,P_y,P_z,\xi_2,\ldots,\zeta_n,t)\, e^{\frac{2\pi i}{h}(XP_x + YP_y + ZP_z)}\, dP_x\, dP_y\, dP_z.$$

If we introduce the function

$$Q = F\, e^{\frac{2\pi i}{h}E_c t},$$

$\Psi$ takes the form of Eq. (15·19). Then substitution in the wave equation (15·15)
shows that Q must be a solution of Eq. (15·18).

Now let us suppose that we observe the configuration of the system
for some very large value of the time t. As in the case of a single free
particle, the vector momentum $\vec P$ is to be determined from the
positional coordinates X, Y, Z in accordance with the relation

$$\vec P = \frac{M\vec R}{t},$$

in which $\vec R$ is the position vector for the center of gravity (X, Y, Z).
Applying the theorem of Eq. (15·7) we find that, in the limit of large values
of t, $\Psi(X,Y,Z,\xi_2,\ldots,\zeta_n,t)$ is determined by
$Q(P_x,P_y,P_z,\xi_2,\ldots,\zeta_n,t)$ with

$$P_x = \frac{MX}{t}, \qquad P_y = \frac{MY}{t}, \qquad P_z = \frac{MZ}{t}.$$

Hence the parameters $P_x, P_y, P_z$ of Eq. (15·19) are to be identified
with the components of $\vec P$. The probability that the variables
$P_x, P_y, P_z, \xi_2, \ldots, \zeta_n$ lie in the elementary range
$dP_x\,dP_y\,dP_z\,d\xi_2 \cdots d\zeta_n$ is then

$$dW = \Psi\Psi^*\left(\frac{t}{M}\right)^3 dP_x\, dP_y\, dP_z\, d\xi_2 \cdots d\zeta_n = QQ^*\, dP_x\, dP_y\, dP_z\, d\xi_2 \cdots d\zeta_n. \qquad (15\cdot21)$$

Let $dW_1$ denote the total probability of the momentum range
$dP_x\,dP_y\,dP_z$ for all values of the relative coordinates. This
quantity is obtained by integrating the expression for dW as follows:

$$dW_1 = dP_x\, dP_y\, dP_z \int\!\cdots\!\int QQ^*\, d\xi_2 \cdots d\zeta_n. \qquad (15\cdot22)$$

Similarly the probability that the individual momentum component $P_x$
will lie in the range $P_x' < P_x < P_x' + dP_x$ is

$$dP_x \int\!\cdots\!\int QQ^*\, dP_y\, dP_z\, d\xi_2 \cdots d\zeta_n. \qquad (15\cdot23)$$

*15d. The Measurement of the Momenta of Particles Moving in a
Force Field. In the case of particles moving in a force field the above
scheme of measurement breaks down. The velocity cannot be treated
as constant, and its instantaneous value can be closely approximated by
means of Eq. (15·2) only if (a) the time interval $t_2 - t_1$ is made short,
and (b) the two positional measurements are made with great accuracy.
We cannot in general employ the initial positional measurement involved
in the preparation of the particles, for that will not always be accurate
enough. Hence we again meet the difficulty due to the perturbation
of the momentum by the initial positional measurement.





One conceivable way of avoiding this difficulty would be to measure
the velocity of the particle by means of the Doppler effect which it
imparts to long wave length reflected light. This procedure fails to
work out, however, because the frequency of light waves (energy of
photons) can be measured accurately only if we have to do with a long
train of equally spaced waves, a condition not fulfilled by the reflected
light in this case.

It therefore seems necessary to base our definition of momentum on an
artificial and apparently impossible experiment in which the observer
first abolishes the force field and then measures the momentum at his leisure
by the method previously worked out for free particles! Although we
cannot actually remove a force field instantaneously, we can in many
cases remove it so rapidly that the hypothetical experiment described
may be regarded as a limiting case of a procedure which can be realized
in practice. Hence this method of observation is in the same class with
all the idealized experiments used to give operational¹ meaning to
definitions in physics. In particular, we may note that the exact
measurement of the position of a particle is also impossible in practice,
and that we can give position an exact operational meaning only by
postulating a limiting experiment with radiation of infinitely short wave
length.

Now it follows from the wave equation that $\partial\Psi/\partial t$
remains finite so long as the potential energy is finite. Consequently
the change in $\Psi$ during the period of removal of the force field
approaches zero as the length of that period approaches zero. Hence, if
we define the instantaneous momentum as the momentum obtained by suddenly
removing the force field and carrying out a measurement by the method
used for free particles, the probability of any particular value will be
related to the instantaneous form of the wave function exactly as if the
particle were free. In other words, the probability of the momentum range
$dp_x\,dp_y\,dp_z$ is given as before by Eq. (15·11) with the complex
Fourier amplitude for the momentum $\vec p$, given by Eqs. (15·9) and
(15·10). The only difference between the two cases is that, if the
particle is not free, $\Phi$ is not a harmonic function of t and hence
$\Phi\Phi^*$ is not constant in time.²

¹ Cf. P. W. Bridgman, loc. cit.

² In Sec. 12 we showed that the vector group velocity $\vec v_g$ of a
symmetrical wave packet taken as a whole is equal to $h\vec\sigma_0/\mu$,
where $\vec\sigma_0$ is the wave number with maximum amplitude obtained by
an instantaneous Fourier analysis of the packet into plane-wave
components. $h\vec\sigma_0$ is then identical with the most probable
momentum given by the method of observation described above. It is very
satisfactory to find that the most probable momentum is equal to
$\mu\vec v_g$.





If the force field has a vector potential $\vec a$ (cf. Sec. 7), the
momentum is defined not by Eq. (15·1), but by the relation

$$\vec p = \mu\vec v + \frac{e}{c}\vec a. \qquad (15\cdot24)$$

If we suddenly reduce $\vec a$ to zero, the induced electric field given by

$$-\frac{1}{c}\frac{\partial\vec a}{\partial t}$$

would in the classical theory give an impulse to the particle just equal and
opposite to the change in $\frac{e}{c}\vec a$. Thus $\vec p$ is really
unaffected by suddenly reducing $\vec a$ to zero. The wave function is also
unchanged and we conclude that the probability of different momentum values
is given by Eqs. (15·10) and (15·11) as before.
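The classical bookkeeping behind this statement can be followed step by step. The sketch below is one-dimensional and assumes a spatially uniform vector potential ramped linearly to zero, with magnetic forces ignored; the function name and the numerical values are illustrative only. It accumulates the impulse from the induced field and compares $\mu v + (e/c)a$ before and after.

```python
def switch_off_vector_potential(mass, charge, c, v0, a0, steps=1000):
    """Ramp a uniform vector potential from a0 to zero in equal steps.

    Returns the momentum mass*v + (charge/c)*a before and after the ramp.
    One-dimensional sketch; magnetic forces and spatial variation are ignored.
    """
    v, a = v0, a0
    p_before = mass * v + charge * a / c
    da = -a0 / steps
    for _ in range(steps):
        # Induced field E = -(1/c) da/dt, so the impulse charge*E*dt is -(charge/c)*da.
        v += -(charge / c) * da / mass
        a += da
    p_after = mass * v + charge * a / c
    return p_before, p_after

before, after = switch_off_vector_potential(mass=1.0, charge=1.0, c=1.0, v0=0.3, a0=2.0)
print(before, after)
```

The velocity changes during the ramp, but the combination defined by Eq. (15·24) is the same at the end as at the beginning, apart from rounding.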

*15e. Individual Momenta of Particles in a System. Finally, let us
consider the determination of the individual momenta of the particles
in a system moving under the influence of a potential energy V. For
simplicity we shall suppose that there are only two particles in the
system, the extension of the argument to more general cases being
relatively simple. In order to permit a measurement we shall suppose
that, at an arbitrary instant $t_0$, the potential function and the mutual
forces between the particles are abolished. We then observe the positions
of the two particles after a long period of time and compute the
momenta as before. For values of t greater than $t_0$ the wave function
$\Psi(x_1,y_1,z_1,x_2,y_2,z_2,t)$ obeys the equation

$$\frac{1}{\mu_1}\left(\frac{\partial^2\Psi}{\partial x_1^2} + \frac{\partial^2\Psi}{\partial y_1^2} + \frac{\partial^2\Psi}{\partial z_1^2}\right) + \frac{1}{\mu_2}\left(\frac{\partial^2\Psi}{\partial x_2^2} + \frac{\partial^2\Psi}{\partial y_2^2} + \frac{\partial^2\Psi}{\partial z_2^2}\right) = -\frac{4\pi i}{h}\frac{\partial\Psi}{\partial t}. \qquad (15\cdot25)$$

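Any quadratically integrable solution of this equation can be expressed
in the form

$$\Psi = \frac{1}{h^3}\int\!\cdots\!\int G(\alpha_1,\ldots,\gamma_2)\, e^{-\frac{2\pi i}{h}E(t-t_0)}\, e^{\frac{2\pi i}{h}(\alpha_1 x_1 + \beta_1 y_1 + \gamma_1 z_1 + \alpha_2 x_2 + \beta_2 y_2 + \gamma_2 z_2)}\, d\alpha_1 \cdots d\gamma_2,$$

where $\alpha_1, \beta_1, \gamma_1$ and $\alpha_2, \beta_2, \gamma_2$ are to
be identified with the momentum components of particles 1 and 2
respectively, while

$$E = \frac{\alpha_1^2 + \beta_1^2 + \gamma_1^2}{2\mu_1} + \frac{\alpha_2^2 + \beta_2^2 + \gamma_2^2}{2\mu_2}.$$

For very large values of $t - t_0$ we can prove, in analogy with Eq. (15·7),
that

$$\Psi = i\,\frac{(\mu_1\mu_2)^{3/2}}{(t-t_0)^3}\, \exp\left\{\frac{\pi i}{h(t-t_0)}\left[\mu_1(x_1^2+y_1^2+z_1^2) + \mu_2(x_2^2+y_2^2+z_2^2)\right]\right\}\, G\!\left(\frac{\mu_1 x_1}{t-t_0}, \frac{\mu_1 y_1}{t-t_0}, \ldots, \frac{\mu_2 z_2}{t-t_0}\right) + \cdots.$$

It follows that the probability of the elementary momentum range
$d\alpha_1\,d\beta_1\,d\gamma_1\,d\alpha_2\,d\beta_2\,d\gamma_2$ is

$$dW = GG^*\, d\alpha_1 \cdots d\gamma_2 = \Phi\Phi^*\, d\alpha_1 \cdots d\gamma_2$$

with

$$\Phi = G\, e^{-\frac{2\pi i}{h}E(t-t_0)}.$$

To get the probability of any individual momentum range $d\alpha_1$, we have
only to integrate over all values of the other variables.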

15f. Summary: The Determination of the Wave Function of an
Assemblage. In concluding this discussion of measured linear momentum
the following points are to be emphasized:

a. Linear momentum has been defined by means of an idealized experiment
which is in agreement with the classical conception. According
to this definition the momentum is determined by a measurement of
mass and by two positional observations in a field-free space separated
by a large time interval. The first of these two observations may be
identified with the process of preparing the assemblage of systems under
consideration which will normally locate the members more or less
exactly in space. The large time interval is introduced because the
positional observations are inexact.¹

b. By hypothesis the behavior in time of an assemblage of systems
with a common initial subjective state is determined by a wave function
$\Psi$ which obeys the Schrödinger equation (7·3) and whose initial form is
fixed by experimental conditions involved in the initial preparation of the
assemblage. We have computed the probability of the various momentum
values in terms of the instantaneous form of this function. In
the case of a free particle, the momentum probability is independent of
time.

c. As regards relations between the measured momentum and the
classical local momentum, we may observe that in the limiting case of a
sharply defined wave packet to which the classical mechanics is
applicable, the average and most probable values of the measured momentum
are to be identified with each other and with the local momentum at the
momentary position of the center of the packet. This follows from the
work of Sec. 12. In Sec. 33b, it will be proved that the statistical mean
value of the square of the measured momentum for any single energy
wave function is equal to the mean value of the square of the classical
local momentum.

¹ Strictly speaking, we should note that a positional observation is a correlation of
a pair of values of positional coordinate and time in which there may be as much
uncertainty about the time as about the positional coordinate. Uncertainties in
time as well as position are rendered unimportant by the use of the large time interval.

In connection with b the question arises: How are we to determine the
wave function of an assemblage and to what extent can this be done
experimentally? To answer this question we note that to prepare an
assemblage of systems in a definite subjective state we must ordinarily
start with a natural assemblage whose members are distributed through
many such states and subject it to a sorting process which will eliminate
all systems except those in a single, or very nearly single, subjective
state (cf. Sec. 41d). In such cases the method of preparing the
assemblage experimentally defines the state and we can often determine the
wave function which characterizes it by elementary theory. Thus
any classical scheme for preparing a beam of electrons with a definite
energy is a scheme for preparing an assemblage whose wave functions
all have a single vibration frequency. A beam of electrons diverging
from a point source O will be associated with wave functions in the form
of diverging spherical waves. If we block off all electrons from such a
beam except those in the neighborhood of a point P which is far from O,
the remainder of the assemblage will consist of electrons in states with
$\Psi$ functions in the form of wave packets with their normals
approximately parallel to OP.

Suppose, for example, that we wish to perform a simple electron-
diffraction experiment. The electrons can be accelerated through a
known potential difference V starting from an initial state in which their
thermal energy is small compared to V. After passing through an initial
aperture A they can be projected against a diaphragm B containing an
aperture, or apertures, S. Near S the $\Psi$ function for the electrons
incident on B will have the form of a plane monochromatic wave. To
determine the exact form of the wave function for the electrons emerging
from S it would be desirable to treat quantum-mechanically the problem
of the interaction of the incident wave system and the diaphragm B.
In practice, however, it would probably be sufficient to adopt the
procedure customary in the elementary theory of optical diffraction,
assuming that for the emerging electrons, in the plane of B, $\Psi$
vanishes on the diaphragm and has sensibly the same value at each point
of the aperture as if the diaphragm were absent. The process of picking
out of a natural assemblage of electrons a subassemblage with a definite
wave function is essentially the same as that of preparing a beam or
packet of coherent radiation. The phase of the wave function is never
determined, and in fact we may always regard the assemblage as an
aggregate of subassemblages with $\Psi$'s which differ in phase but are
otherwise the same.

The above discussion assumes that the wave function of an assemblage of
particles in a definite subjective state is to be determined theoretically
from the method of preparing the state. It is also possible in principle to
determine $\Psi\Psi^*$ for such an assemblage by repeated positional
observations on systems which belong to it, using high-frequency radiation.¹
Of course each such observation would alter the energy and momentum of the
system observed and so exclude it from the original assemblage. If all members
of the original assemblage were in the same state, however, the removal of a
random selection of systems from the assemblage would not alter that state or
the wave function which describes it.²

It follows that, for a pure case assemblage, $\Psi\Psi^*$ is an essentially
observable function and we infer that $\partial(\Psi\Psi^*)/\partial t$ is also
an observable function. Now Feenberg³ has shown that the values of
$\Psi\Psi^*$ and $\partial(\Psi\Psi^*)/\partial t$ at any given time $t$
determine $\Psi$ itself apart from a trivial constant phase factor. Thus,
setting aside the physically meaningless phase factor, we can say that, in
principle, $\Psi$ can be determined by positional observations.

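Feenberg's argument for the three-dimensional case, with slight modification,
is as follows. Let

$$\Psi = \chi e^{i\varphi},$$

where $\chi$ and $\varphi$ are assumed to be real and analytic. Then [cf. Eq. (8·6)],

$$\frac{\partial(\Psi\Psi^*)}{\partial t} = \frac{ih}{4\pi\mu}\,\mathrm{div}\,[\Psi^*\,\mathrm{grad}\,\Psi - \Psi\,\mathrm{grad}\,\Psi^*],$$

or

$$\frac{\partial\chi^2}{\partial t} = -\frac{h}{2\pi\mu}\,\mathrm{div}\,[\chi^2\,\mathrm{grad}\,\varphi]. \tag{15·26}$$

Let $\chi^2$ and $\partial\chi^2/\partial t$ be given for some time, say $t_0$.
Then the function $\varphi$ for $t_0$ is determined to an additive constant. To
prove this we let $u$ denote the difference between two possible solutions
$\varphi_1$, $\varphi_2$ of the equation

$$-\frac{h}{2\pi\mu}\,\mathrm{div}\,[\chi(t_0)^2\,\mathrm{grad}\,\varphi] = \left(\frac{\partial\chi^2}{\partial t}\right)_{t_0}.$$

We have

$$\mathrm{div}\,[\chi(t_0)^2\,\mathrm{grad}\,u] = 0,$$

and hence

$$\mathrm{div}\,(u\chi^2\,\mathrm{grad}\,u) = \chi^2\,\mathrm{grad}^2 u.$$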

¹ In virtue of the Pauli exclusion principle this statement holds even when the
system contains two or more identical particles (cf. Sec. 42b).

² Cf. J. von Neumann, M.G.Q., p. 159.

³ E. Feenberg, “The Scattering of Slow Electrons in Neutral Atoms,” Thesis,
Harvard University, 1933.





Then, applying the transformation of Gauss, and observing that the resulting
surface integral approaches zero as the volume over which we integrate
approaches all configuration space as a limit, we obtain

$$\iiint_{\infty} \chi^2\,\mathrm{grad}^2 u\,d\tau = 0.$$

As the integrand is nowhere negative and is continuous, it follows that
$\chi\,\mathrm{grad}\,u$ must vanish everywhere. But $\chi$ can vanish only on
the nodal surfaces of $\Psi$. Hence $u$ must be a constant, say $\alpha$.

Let $\chi_1$ and $\chi_2$ denote two possible choices of the function $\chi$
consistent with a given form for $\chi(t_0)^2$. As these functions are real and
analytic, it is necessary that

$$|\chi_2| = |\chi_1| \quad\text{or}\quad \chi_2 = \pm\chi_1.$$

In either case the complete wave function $\Psi(t_0) = \chi e^{i\varphi}$ is
seen to be completely determined by $\chi^2(t_0)$ and
$(\partial\chi^2/\partial t)_{t_0}$ except for a constant phase factor.

We conclude that the wave function of an assemblage known to be in a pure state
is essentially observable.


16. THE HEISENBERG UNCERTAINTY PRINCIPLE¹

The preceding paragraphs have made it clear that normally an assemblage
described by a wave function involves a range of positional coordinates and
momenta. It is not possible for a quadratically integrable wave function to
represent an assemblage of systems having a unique value of the momentum, for
single momentum functions are infinite plane waves. Thus a reduction in the
range of momentum values associated with a wave packet tends to increase the
range of coordinate values and vice versa. Heisenberg has given this principle
the following quantitative formulation:

Let $\Delta q_k$ and $\Delta p_k$ denote respectively the uncertainties in the
values of the coordinates $q_k$ and of the corresponding momenta $p_k$ computed
from a wave function $\Psi$. The value of the product $\Delta q_k\,\Delta p_k$
varies with the choice of $\Psi$ but has the lower bound $h/4\pi$.

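Here we define $\Delta q_k$ and $\Delta p_k$ as the respective root mean squares
of the deviations of $q_k$ and $p_k$ from their mean values. To be precise, let
$\Phi$ denote the probability amplitude for momentum defined in the
three-dimensional case by Eq. (15·10) and let $d\tau_p$, $d\tau_q$ denote
elements of volume in momentum space and the space of the coordinates $q$,
respectively. Let the mean values of $q_k$ and $p_k$ be given by

$$\bar q_k = \int q_k\,|\Psi|^2\,d\tau_q; \qquad \bar p_k = \int p_k\,|\Phi|^2\,d\tau_p.$$

Then $\Delta q_k$ and $\Delta p_k$ are given by the relations

$$(\Delta q_k)^2 = \int (q_k - \bar q_k)^2\,|\Psi|^2\,d\tau_q; \qquad (\Delta p_k)^2 = \int (p_k - \bar p_k)^2\,|\Phi|^2\,d\tau_p. \tag{16·1}$$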

¹ Cf. W. Heisenberg, Zeits. f. Physik 43, 172 (1927); N. Bohr, Nature 121, 580
(1928).





The relation $\Delta p_k\,\Delta q_k \geq h/4\pi$ is conveniently referred to as
Heisenberg's inequality. A general proof of this inequality insofar as it
applies to linear momentum is given in Sec. 33. Here we content ourselves with
showing that in the case of a special type of one-dimensional wave packet
previously discussed in Sec. 9, in which the distribution functions
$|\Psi|^2$ and $|\Phi|^2$ are Gaussian error curves, the product
$\Delta q\,\Delta p$ has $h/4\pi$ as its minimum value. We identify $q$ and
$p/h$ with the $x$ and $\sigma$ of Sec. 9, respectively. In that article it was
proved that if $\Psi$ has the form

$$\Psi(x,t) = \int_{-\infty}^{+\infty} G(\sigma)\,e^{2\pi i(\sigma x - \nu t)}\,d\sigma$$

appropriate to a free particle, and if $G(\sigma)$ is chosen to give all the
component waves which form the integral the same phase at some space-time point
$x_0, t_0$, the quantity $\Delta x$ defined by (9·15) in harmony with (16·1)
takes on a minimum value at $t = t_0$. We further proved that if $G(\sigma)$ is
given the form of Eqs. (9·14) and (9·21), so that the momentum distribution
function $|\Phi(p)|^2 = h^{-1}|G(\sigma)|^2$ has the form of a Gaussian error
curve, the minimum value of $\Delta x$ is
$\dfrac{1}{\sqrt{2}}\,\dfrac{\alpha}{2\pi}$ [cf. Eq. (9·22)]. But Eq. (9·20)
shows that under the same circumstances

$$\overline{(\sigma - \bar\sigma)^2} = \frac{1}{2\alpha^2}. \tag{16·2}$$

Identifying $\overline{(\sigma - \bar\sigma)^2}$ with $(\Delta p/h)^2$, and
combining (9·22) with (16·2), we obtain $(\Delta x)_{\min}\,\Delta p = h/4\pi$,
independent of the choice of the parameters $\alpha$, $x_0$, $t_0$.
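
The minimum product for a Gaussian packet can also be checked numerically. The following is only a sketch under stated assumptions: units are chosen so that $h/2\pi = 1$ (hence $h/4\pi = 1/2$), the packet is written directly in terms of its coordinate width rather than in terms of the parameter of Sec. 9, and the names `uncertainty_product`, `width`, `k0`, `half_width` and `n` are illustrative and do not come from the text. The mean values are evaluated by simple sums over a grid, in the spirit of the definitions (16·1), with the momentum moments obtained from the derivative of the wave function.

```python
import cmath
import math


def uncertainty_product(width=0.7, x0=0.3, k0=2.0, half_width=12.0, n=4001):
    """Return (dx, dp) for a Gaussian packet, in units with h/(2*pi) = 1.

    The packet is psi(x) = exp(-(x - x0)^2 / (4 width^2)) * exp(i k0 x),
    so that |psi|^2 is a Gaussian error curve with standard deviation `width`.
    """
    step = 2.0 * half_width / (n - 1)
    xs = [x0 - half_width + i * step for i in range(n)]
    psi = [cmath.exp(-((x - x0) ** 2) / (4.0 * width ** 2) + 1j * k0 * x) for x in xs]

    # Normalization integral and coordinate moments (Riemann sums).
    norm = sum(abs(p) ** 2 for p in psi) * step
    mean_x = sum(x * abs(p) ** 2 for x, p in zip(xs, psi)) * step / norm
    var_x = sum((x - mean_x) ** 2 * abs(p) ** 2 for x, p in zip(xs, psi)) * step / norm

    # Central differences for d(psi)/dx at the interior grid points.
    dpsi = [(psi[i + 1] - psi[i - 1]) / (2.0 * step) for i in range(1, n - 1)]
    inner = psi[1:-1]

    # <p> = integral of psi* (-i d/dx) psi ;  <p^2> = integral of |d psi/dx|^2.
    mean_p = sum((q.conjugate() * (-1j) * d).real for q, d in zip(inner, dpsi)) * step / norm
    mean_p2 = sum(abs(d) ** 2 for d in dpsi) * step / norm
    var_p = mean_p2 - mean_p ** 2

    return math.sqrt(var_x), math.sqrt(var_p)


if __name__ == "__main__":
    dx, dp = uncertainty_product()
    print("dx =", dx, " dp =", dp, " dx*dp =", dx * dp, " h/4pi =", 0.5)
```

Changing `width`, `x0` or `k0` should leave the product unchanged to within the discretization error, which is the numerical counterpart of the statement that the result does not depend on the choice of the packet parameters.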

Although the assumption that $\Psi$ is a solution of the free-particle equation
(9·1) was needed in Sec. 9 for a discussion of the motion of the packet and of
the variation in $\Delta x$ with time, the reader will readily verify that
$\Delta x\,\Delta p = h/4\pi$ holds true at the instant $t_0$ whether the
particle is free or not, provided that at the instant under consideration

$$\Psi(x,t_0) = \int_{-\infty}^{+\infty} \exp\!\left[-\frac{\alpha^2(\sigma-\sigma_0)^2}{2} + 2\pi i(x - x_0)\sigma\right] d\sigma.$$


The relation $\Delta q\,\Delta p \geq h/4\pi$ is also valid if we identify $q$
with the time and $p$ with the energy of the system. Thus the uncertainty in
the time at which a particle associated with a wave packet moving along the
$x$ axis passes the point $x = x_1$ would be approximately equal to the
uncertainty in $x$ divided by the group velocity $\partial\nu/\partial\sigma_x$:

$$\Delta t = \frac{\Delta x}{\partial\nu/\partial\sigma_x} = \frac{\Delta x}{\partial E/\partial p_x}.$$

It follows from the energy formula for a free particle,

$$E = \frac{p_x^2 + p_y^2 + p_z^2}{2\mu},$$

that, when the particle is traveling along the $x$ axis (average values of
$p_y$ and $p_z$ being zero), the uncertainty in $E$ is produced almost entirely
by the uncertainty in $p_x$. Then

$$\Delta E = \Delta p_x\,\frac{\partial E}{\partial p_x}$$

and

$$\Delta E\,\Delta t = \Delta x\,\Delta p_x \geq \frac{h}{4\pi}. \tag{16·3}$$


This result becomes reasonable when we recollect that in the Hamiltonian theory
of classical dynamics $t$ and $-E$ are canonically conjugate variables like $x$
and $p_x$ (cf. Sec. 7c). A plausible extrapolation would lead us to suppose
that Heisenberg's uncertainty principle is applicable to any pair of
classically conjugate dynamical variables. Robertson¹ has actually extended the
proof to the general case where the conjugate variables $p$ and $q$ are
functions of the Cartesian coordinates and momenta expressible as power series
in the latter. Some difficulty is to be anticipated in the case of still
broader generalizations, however, for not every pair of classically conjugate
variables leads to a satisfactory corresponding pair of conjugate
quantum-mechanical variables (cf. Sec. 39a, pp. 295-298).

We turn our attention now from the mathematical formulation of Heisenberg's
uncertainty principle as a consequence of the hypotheses of wave mechanics to
the experimental implications of the principle.

Since, by hypothesis, our maximum knowledge of the state of a system is given
by a suitable wave function $\Psi$, it will be evident that, if the theory is
correct and observations are made on a particle to determine the simultaneous
values of a coordinate $q$ and momentum $p$, the experimental uncertainties in
the observed values must also be subject to the limiting relation

$$(\Delta q\,\Delta p)_{\min} \sim h. \tag{16·4}$$

¹ H. P. Robertson, Phys. Rev. 34, 163 (1929).

The question now arises: Does this theoretical limitation on the accuracy of
simultaneous observations of conjugate variables correspond to the experimental
facts, or is it a weakness of the theory? This question has been answered by
Heisenberg, who has shown by an analysis of the various possible experimental
means for determining the values of coordinates and momenta that the former
alternative is undoubtedly correct. For example, one can determine the position
of a particle by observing the direction of motion of photons or electrons
which have been scattered by collision with it, or one may allow the particle
to pass through a small aperture in order to locate a point in its orbit. The
momentum can be determined by successive positional observations in a
field-free space or by the Doppler effect. In each case if the alteration in
momentum due to collision with a second particle is taken into account (cf.
p. 59), and if the ordinary theory of optical diffraction is assumed to apply
to matter corpuscles as well as to photons, it turns out that the relation
(16·4) is verified by the analysis. For the detailed study of the various
hypothetical and idealized experiments used in reaching this conclusion, the
reader is referred to the original papers of Heisenberg and Bohr¹ and to the
excellent exposition in Heisenberg's book The Physical Principles of the
Quantum Theory.

Of course the analysis of Heisenberg's mental experiments is based on just
those empirical facts regarding the Compton effect and the diffraction of
matter which the wave mechanics is designed to describe. Hence the two methods
of deriving the relation (16·4) are not altogether independent, but it is a
source of satisfaction to find that the uncertainty principle can be
established by very elementary considerations as well as from the detailed
mathematical structure we have built up.

To avoid misunderstanding, a few words are introduced here about the meaning of
the concept “observation.” A more complete treatment of the subject will be
given in Chap. IX.

When an atomic system is “observed” it interacts with an observing mechanism.
If a certain type of observation is carried out for each member of an
assemblage of similar systems, the assemblage will be divided, in general, into
two or more subassemblages according to the outcome of the individual
measurements. Thus, in the simple case of a beam of electrons radiating from an
initial aperture $A$ toward a diaphragm containing a second aperture $S$, the
diaphragm is the essential feature of the observing mechanism and divides the
electrons into two classes, viz., those that get through and those that do not.
We discover electrons of the former class beyond the diaphragm and say that
they “have been observed to pass through $S$.” Obviously the $\Psi$ function
which characterizes the future behavior of the subassemblage of electrons which
have passed through the aperture is different from that which characterized the
behavior of the complete assemblage leaving the initial aperture $A$. The
discontinuous change in the wave function describing the subjective state of a
system, which occurs when the system is measured, is called the “reduction of
the wave packet” and will be discussed more fully in Sec. 41c. The relation
(16·4) applies to the values of $(\Delta q\,\Delta p)_{\min}$ associated with
either wave function, but not to mixed products involving, say, the $\Delta p$
value for an assemblage before measurement and the $\Delta q$ value for a
subassemblage after measurement. Such mixed products can be made as small as
desired but are of no particular physical significance.

¹ See the footnote to the heading of Sec. 16.

Although a wave function $\Psi$ defining the subjective state of a system does
not in general permit a unique prediction regarding the result of any
particular observation or measurement of the system under consideration,¹ it
does give, as we have repeatedly stated, the maximum obtainable information
regarding the system. From the standpoint of classical mechanics the state of a
system is defined by the instantaneous values of the coordinates and their
corresponding momenta. Heisenberg's uncertainty principle shows that actually
this “classical state” can never be known and hence has no operational meaning.
The conception of a particle with a uniquely defined position and momentum is a
very natural and useful extrapolation or idealization of everyday experience,
but in the last analysis we are driven by the experimental facts to the
conclusion that it has no place in the theory of atomic physics.

As Bridgman² has pointed out, our inability to make simultaneous exact
observations of position and momentum may be referred to our inability to trace
out the details of collisions between photons, electrons, and apertures, or
other more complicated systems. This inability, in turn, is due to the absence
of tools finer than complete collisions for making the measurements necessary
to give reality to such details. The importance of this observation lies in the
light which it sheds on the nature of the new experimental discoveries which
would be needed to restore the validity of the classical conceptions of
position and velocity.

It should be emphasized, however, that since simultaneous exact values of
positional coordinates and their conjugate momenta are actually unthinkable in
terms of wave-mechanical concepts, the discovery of experimental means for
breaking down the Heisenberg uncertainty principle would not lead to an
elaboration of our present quantum mechanics but to the complete destruction of
its essential features. Hence we may regard the experimental evidence for the
validity of quantum-mechanical predictions as evidence that no such violations
of the uncertainty principle will ever be found.

The universal interest which the uncertainty principle has aroused among
physicists and philosophers is due to its intimate connection with the
hypothesis of indeterminism. So far as physics is concerned, this hypothesis
may be said to have originated in the discovery of the law of radioactive
decay. It received important support from Einstein's speculations on the
transition probabilities which govern the jumps of atoms from one energy level
to another and is the cornerstone of the present theory (cf. Sec. 2). Classical
determinism is the doctrine that, if the present state of an isolated system
(eventually the universe) were known, the future behavior of the system could
be exactly foretold by a sufficiently expert mathematician. This view is
incompatible with quantum mechanics, which not only denies the validity of the
hypothetical calculation but, through the principle of uncertainty, asserts
that the information required by the supermathematician is both unobtainable
and meaningless in terms of operations which we can perform, even in principle.

¹ The same information is contained in the probability amplitude
$\Phi(p_x,p_y,p_z,t)$ of Sec. 15 and in any of the infinite number of
probability amplitudes obtainable from $\Psi$ by an appropriate change of
variables.

² P. W. Bridgman, Harper's Mag., March, 1929, p. 443.

Although quantum mechanics is thus essentially indeterministic, a modicum of
determinism remains in the fact that the $\Psi$ function of an isolated system
is uniquely determined for all time by its initial form (cf. Sec. 5).
Indeterminism comes into the scheme in the relation between the wave function
and the results of individual experiment, or measurement. As measurements
always imply interaction of the system under observation with an external
apparatus, it is possible to argue that the root of the indeterministic form of
quantum mechanics lies in the apparently inevitable division of the universe
into “observed” and “observer.”



CHAPTER III 


ONE-DIMENSIONAL ENERGY-LEVEL PROBLEMS 


17. BOUNDARY AND CONTINUITY CONDITIONS; EIGENVALUES AND 

EIGENFUNCTIONS 


It is a well-known empirical fact that each species of atom has a
characteristic set of discrete energy levels and associated “stationary”
states. The energy differences of these states determine the frequencies of the
spectrum lines and the so-called transition probabilities for “jumps” between
the levels determine the intensities. Perhaps the most fundamental task of
quantum mechanics is that of locating these levels and evaluating the
transition probabilities on a purely theoretical basis.

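Since we have already identified energy and frequency, it is clear that the
state of an atom having a definite energy must be given by a monochromatic
solution of the corresponding Schrödinger equation. If the internal energy is
to be definite, the wave function for the atom must factor into the product of
a function of the coordinates of the center of gravity, and a function of the
relative coordinates [cf. Eq. (15·16)]. Then the latter factor must be a
monochromatic solution of Eq. (15·18) and its amplitude function $\psi$ must be
a solution of the time-free wave equation

$$\sum_{k=2}^{n}\frac{1}{\mu_k}\left(\frac{\partial^2}{\partial\xi_k^2}+\frac{\partial^2}{\partial\eta_k^2}+\frac{\partial^2}{\partial\zeta_k^2}\right)\psi
+\frac{1}{\mu_1}\sum_{j=2}^{n}\sum_{k=2}^{n}\left(\frac{\partial^2}{\partial\xi_j\,\partial\xi_k}+\frac{\partial^2}{\partial\eta_j\,\partial\eta_k}+\frac{\partial^2}{\partial\zeta_j\,\partial\zeta_k}\right)\psi
+\frac{8\pi^2}{h^2}(E-V)\psi = 0. \tag{17·1}$$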


When this equation is applied to an atom with its single heavy nucleus and
group of $f = n - 1$ identical electrons, it is convenient to identify the
nucleus with particle 1. The coordinates $\xi_k$, $\eta_k$, $\zeta_k$ are then
the Cartesian coordinates of the electrons referred to a set of axes with the
origin at the nucleus. Usually the situation is idealized by treating the
nuclear mass as infinite in comparison with the electronic mass $\mu$.¹ Then
the above equation reduces to

$$\sum_{k}\left(\frac{\partial^2}{\partial\xi_k^2}+\frac{\partial^2}{\partial\eta_k^2}+\frac{\partial^2}{\partial\zeta_k^2}\right)\psi+\frac{8\pi^2\mu}{h^2}(E-V)\psi = 0. \tag{17·2}$$

Here the summation is to be extended over all the electrons; they may be
renumbered so that $k$ runs from 1 to $f$.

¹ Cf., however, the correction for the finite mass of the nucleus given by
D. S. Hughes and C. Eckart, Phys. Rev. 36, 694 (1930).



In order to get discrete energy levels from Eq. (17·1) we must seek a class of
solutions which satisfy appropriately chosen boundary conditions. Otherwise all
energy values would stand on the same footing. The boundaries to be considered
are the “point” at infinity and the points or domains where the potential
energy becomes infinite. Such domains are called singular domains¹ of the
differential equation and include all points of configuration space which
locate any two particles at the same point of ordinary three-dimensional space.
There is a marked tendency for solutions of the differential equation to become
infinite at the boundary points, and therefore Schrödinger introduced as a
boundary condition the requirement that $\psi$ shall be finite everywhere. This
condition usually gives energy values in conformity with experimental results,
but does not apply to the Dirac relativistic theory of the single electron, and
is too much of an ad hoc proposition to be wholly satisfactory in any case.

A better motivated condition is that of quadratic integrability, by which we
mean the requirement that $\int \psi\psi^*\,d\tau$ when extended over all
configuration space shall converge. This condition is forced upon us, as we
have already seen (cf. Sec. 8), by the fundamental hypothesis that
$\psi\psi^*\,d\tau$ measures the probability that the system represented by
$\psi$ has a configuration belonging to the element $d\tau$ of configuration
space. In the neighborhood of the point at infinity the quadratic integrability
requirement is stronger than the Schrödinger boundary condition, but at finite
singular points the Schrödinger condition is stronger.

For the present it will be convenient to make use of both of these restrictions
and of the following continuity condition.

Continuity Condition: All wave functions are required to be continuous and
single-valued, and to possess continuous first and second derivatives with
respect to the Cartesian coordinates in the neighborhood of every interior
point of the region of definition of the Schrödinger equation (i.e., in the
neighborhood of every nonsingular point).

¹ The existence theorem for an ordinary linear differential equation of the
second order such as Eq. (23·4) below states that if the coefficients $p_0$,
$p_1$, $p_2$ are real and continuous in the neighborhood of a point $x = a$,
and if $p_0$ does not vanish at that point, there is a solution continuous in
the neighborhood of $x = a$ and satisfying the arbitrary initial conditions

$$y(a) = y_0, \qquad y'(a) = y_0'.$$

A singular point for such a differential equation is defined as a point at
which one of the conditions for the establishment of the existence theorem
ceases to hold. (Cf. E. L. Ince, Ordinary Differential Equations, pp. 69, 72,
160, 356-370, London, 1927.)

Functions which satisfy the continuity conditions and are bounded and
quadratically integrable will be referred to as functions of class or type A.
Such functions will also be referred to as “physically admissible,” by which we
indicate that they are adapted to the description of the behavior of
quantum-mechanical physical systems. In Sec. 32b the question of the exact
nature of the boundary-continuity conditions appropriate to physically
admissible functions will be reconsidered and new and more stringent
requirements will be laid down.

In addition to the type A functions we shall have use for solutions of the
Schrödinger equations which are not quadratically integrable and hence not
physically admissible in the above sense, but which do satisfy the continuity
condition and can be compounded by integration to build up wave-packet
solutions of the Schrödinger equation (7·3) which are physically admissible. We
shall designate functions of this latter variety as functions of class or
type B. The plane-wave functions of Chap. II are simple examples.

The calculation of the energy levels is thus reduced to a boundary-value
problem of a type long familiar to mathematical physicists. (The simplest
example of a classical problem of this type is the determination of the
frequencies and modes of vibration of a stretched string with fixed end
points.) Type A solutions of Eqs. (17·1) and (17·2) exist only for certain
discrete values of $E$. In addition there are type B solutions for a continuous
range of $E$ values, which usually extends from a finite lower limit to
infinity. All of the energy values which meet the milder condition B are called
by various authors “characteristic values,” “proper values,” or “eigenvalues”¹
and can be identified with energy levels of the atom (or molecule) under
consideration. The corresponding solutions of the differential equation are
called “characteristic functions,” “proper functions,” or “eigenfunctions.” The
problem of determining a set of eigenvalues and eigenfunctions from a
differential equation is closely paralleled by certain problems in the solution
of simultaneous linear algebraic equations involving a parameter. Hence we
speak of “eigenvalue problems” in algebra as well as in differential equations.
Thus the determination of the lengths and directions of the principal axes of a
general ellipsoid is an algebraic eigenvalue problem while the determination of
the frequencies of vibration of an elastic system is a familiar type of
eigenvalue problem based on the differential equations of classical physics.
Still another type of eigenvalue problem arises in the domain of integral
equations, into which differential equations can often be converted.
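
The stretched-string problem mentioned above makes a convenient small illustration of how boundary conditions single out discrete parameter values. The sketch below is not taken from the text: it assumes the string equation in the dimensionless form $y'' + \lambda y = 0$ on $0 \le x \le 1$ with $y(0) = y(1) = 0$, integrates from the left end with a fourth-order Runge-Kutta step, and bisects on $\lambda$ wherever the end value $y(1)$ changes sign. The function names and step sizes are illustrative choices. The known values $\lambda_n = (n\pi)^2$ serve as a check.

```python
import math


def end_value(lam, length=1.0, steps=400):
    """Integrate y'' = -lam * y from x = 0 with y = 0, y' = 1; return y(length)."""
    h = length / steps
    y, v = 0.0, 1.0
    for _ in range(steps):
        # Classical RK4 step for the first-order system y' = v, v' = -lam * y.
        k1y, k1v = v, -lam * y
        k2y, k2v = v + 0.5 * h * k1v, -lam * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -lam * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -lam * (y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return y


def string_eigenvalues(count=3, length=1.0, lam_start=0.5, lam_step=1.0):
    """Scan lam upward and bisect on each sign change of y(length)."""
    found = []
    lo = lam_start
    f_lo = end_value(lo, length)
    while len(found) < count:
        hi = lo + lam_step
        f_hi = end_value(hi, length)
        if f_lo * f_hi < 0.0:
            a, b, fa = lo, hi, f_lo
            for _ in range(50):
                mid = 0.5 * (a + b)
                fm = end_value(mid, length)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            found.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return found


if __name__ == "__main__":
    for n, lam in enumerate(string_eigenvalues(), start=1):
        print(n, lam, (n * math.pi) ** 2)
```

For every other value of $\lambda$ the curve started at the left end misses the right-hand boundary condition, which is the sense in which "all values would stand on the same footing" if no boundary conditions were imposed.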

In this chapter we make contact with a highly developed field of classical
mathematics and mathematical physics. The author's endeavor is to sketch in as
brief a manner as possible those elements of this older, but still developing,
mathematical structure which are essential to quantum mechanics. Readers to
whom eigenvalue problems are unfamiliar are urged to supplement the account
here given with extensive reading in the mathematical texts referred to in
footnote 1, p. 90. More advanced students are referred to the highly abstract
but systematic general attack on the eigenvalue problems of the quantum theory
(the problem of the parametric Schrödinger equation is not the only one!) in
von Neumann's book Mathematische Grundlagen der Quantenmechanik (cf. book list,
p. xviii).

¹ Half-translated from the German “Eigenwerte.” We shall use the terms
“eigenvalue” and “eigenfunction” in conformity with the custom of many other
English-speaking physicists although we regard the adoption of such a hybrid
nomenclature as unfortunate.

18. THE ONE-DIMENSIONAL ANHARMONIC OSCILLATOR 

As a first example of the energy-level problem we consider the special case of
a particle of mass $\mu$ vibrating in one dimension under the influence of
forces derivable from a potential-energy function $V(x)$. We assume that $V(x)$
has a single minimum placed at the origin for convenience, a pole at
$x = -\infty$, and a finite asymptotic limit $D$ at $x = +\infty$ (cf. Fig. 4).

Fig. 4. Potential energy and integral curve graphs for anharmonic oscillator.

In discussing this problem it is convenient to consider three cases
corresponding to different ranges of energy values, as follows:

Case I: $E < V_{\min} = 0$; Case II: $0 < E < D$; Case III: $D < E$.

Case I is classically impossible. As we shall see, it yields no solutions which
conform to the boundary conditions. Case II yields oscillatory motions in
classical theory and gives a series of discrete quadratically integrable
eigenfunctions in quantum mechanics. In Case III the classical motion is
aperiodic so that if one knew the energy of the particle and nothing more, the
expectation of finding it in any finite interval of the $x$ axis would be nil.
The eigenfunctions for Case III are of the type B, not quadratically
integrable, from which we again infer that the expectation of finding a
particle of sharply defined energy greater than $D$ in any finite interval of
the $x$ axis is zero, being the ratio of the integral of $\psi\psi^*$ over the
interval in question to the integral of the same quantity from $-\infty$ to
$+\infty$.





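The classical local momentum and corresponding local de Broglie wave length are
given by the formulas

$$p(x,E) = [2\mu(E - V)]^{1/2}; \qquad \lambda(x,E) = \frac{h}{p} = \frac{h}{[2\mu(E - V)]^{1/2}}. \tag{18·1}$$

The corresponding wave equation (5·9) for the space factor $\psi$ reduces to
the form

$$\frac{d^2\psi}{dx^2} + \kappa(E - V)\psi = 0; \qquad \kappa = \frac{8\pi^2\mu}{h^2}. \tag{18·2}$$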


$\lambda$ has a minimum value at $x = 0$ and becomes infinite at points where
$V(x)$ is equal to $E$ (e.g., the points $x'$ and $x''$ in Fig. 4). Outside the
region $G$ in which $V(x)$ is less than $E$ the formulas yield imaginary values
of the wave length and momentum. Moreover, if we compute the kinetic energy by
the plausible formula

$$T = \frac{p^2}{2\mu} = E - V = -\frac{1}{\kappa\psi}\,\frac{d^2\psi}{dx^2},$$

we obtain a negative value.

Such regions of imaginary classical local momentum are excluded in classical
mechanics but play an important part in the present theory. The wave functions
of our problem are spread out over the whole of the $x$ axis including the part
outside the region $G$. It follows from the physical interpretation of
$\psi\psi^*$ that if the particle is in a state of definite energy $E$, there
is a positive probability that a measurement of its position will discover it
in some part of the region of imaginary local momentum. This does not mean,
however, that measured momentum as defined in Sec. 15 can be imaginary, or that
the measured kinetic energy, as computed from the measured momenta, can be
negative. The relative probabilities of different measured values of these
quantities are not point functions of the coordinates but properties of the
complete wave function of the system.
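
A small numerical illustration of the local momentum and wave length may be helpful. The sketch below assumes, purely for illustration, a Morse-type potential $V(x) = D(1 - e^{-ax})^2$, which has the qualitative shape postulated in this section (minimum at the origin, a pole at $x = -\infty$, asymptotic limit $D$ at $x = +\infty$); the text itself does not specify a functional form. The parameter values and the names `turning_points`, `momentum_squared` and `local_wavelength` are hypothetical, and units are chosen with $h = 2\pi$.

```python
import math

# Assumed illustrative parameters (not from the text); h = 2*pi means h/(2*pi) = 1.
D, A, MU, H = 8.0, 1.0, 1.0, 2.0 * math.pi


def potential(x):
    """Morse-type potential: minimum 0 at x = 0, limit D as x -> +infinity."""
    return D * (1.0 - math.exp(-A * x)) ** 2


def momentum_squared(x, energy):
    """Square of the classical local momentum, 2*mu*(E - V); negative outside G."""
    return 2.0 * MU * (energy - potential(x))


def turning_points(energy):
    """Points x' < 0 < x'' where V = E, valid for 0 < E < D (Case II)."""
    root = math.sqrt(energy / D)
    return -math.log(1.0 + root) / A, -math.log(1.0 - root) / A


def local_wavelength(x, energy):
    """Local de Broglie wave length h/p, or None where p is not real and positive."""
    p2 = momentum_squared(x, energy)
    if p2 <= 0.0:
        return None
    return H / math.sqrt(p2)


if __name__ == "__main__":
    E = 2.0
    x1, x2 = turning_points(E)
    print("turning points:", x1, x2)
    print("wave length at origin:", local_wavelength(0.0, E))
    print("p^2 just outside G:", momentum_squared(x2 + 0.1, E))
```

Inside the region between the two turning points the wave length is real and is shortest at the origin; outside it the squared momentum, and with it the "kinetic energy" $E - V$, comes out negative.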


19. THE QUALITATIVE BEHAVIOR OF THE INTEGRAL CURVES 
EXISTENCE OF CLASS A EIGENFUNCTIONS 

19a. Behavior of Integral Curves in Regions of Positive and Negative Kinetic
Energy. As the coefficients of Eq. (18·2) are real, the real and imaginary
parts of every eigenfunction must be eigenfunctions when taken separately. Any
two type A or type B solutions of the equation for a given energy value can be
proved to be linearly dependent (i.e., one is a multiple of the other), and
hence any eigenfunction can be resolved into the product of a real function of
$x$ and a complex constant. Consequently it suffices to search for real
solutions of the problem which have the advantage that they can be represented
by single integral curves.

¹ Cf. F. Hund, Zeits. f. Physik 40, 742 (1927).


Since the equation is of the second order, every solution is characterized by
two constants of integration. It follows from a fundamental existence theorem
that an integral curve representing a real solution of the differential
equation can be passed through any point of the $x,y$ plane with an arbitrary
slope at that point. The ordinate and slope of the curve for the given abscissa
may be identified with the constants of integration of the solution. In
general, however, such an integral curve will not satisfy the boundary
conditions imposed on eigenfunctions and we have to pick out from the complete
manifold of integral curves for all values of $E$ those which do conform to the
boundary conditions.

Integral curves are obviously concave to the $x$ axis where
$\dfrac{1}{\psi}\dfrac{d^2\psi}{dx^2}$ is negative and convex to the axis where
it is positive. Hence we can see from the differential equation that inside the
region $G$, where $E$ is greater than $V(x)$, the curves are oscillatory,
crossing and recrossing the axis (cf. Fig. 4). In the regions $F$ and $H$ to
the left and right of $G$, on the other hand, the curves are convex to the
axis. Hence none of them can cross the axis more than once in either of these
regions. The nodes of the integral curves and the points $x'$ and $x''$ which
separate $G$ from $F$ and $H$ are points of inflection.

If D — E is positive', so that the n^gion II n'ally exists, evx'ry solution 
of the differential eepiation must be^’orne* itifinite, or approach zero 
monotonically, as x iiu'reases from x" to . Similarly, (^vei;y real 
solution of the diffen'ntial ('(piation either Ix'comes infinite, or approaches 
Z('ro monotonically, as x decreases from x' to — oo . It follows that every 
type A eigenfunction must vanish at the ^Tioundary points x = ± oo. 
(Conversely, every solution of the diffea’cntial equation which vanishes 
at X = ±00 is (piadratically integrable and Ik'ik’c a type A eigenfunc- 
tion. By an appropriate' choice of constants of integration a and /3 we 
can always (independent of E) ])ick out an integral curve which meets the 
above boundary condition at one of the boundary points, say x — — oo . 
Idc'Titifying a and jS, n^spectively, witli the ordinat<» and slope of the 
curve at some point x == f in the region F, let us consider the family of 
integral curves obtained by holding a and E constant while varying 
For small positive values of the curve y = i/'(x) will first approach the 
axis as x takes on increasing negative values and then recede from it as x 
approaches — » . For large positive values of 13 the curve is sure to cross 
the axis and ^ becomc's negatively infinite as x approaches — qo . For a 
(pertain unique value of say there must be an intermediate curve 
w’^hich approaches zero monotonically as x approaches ~ °o . There is 
then one, and only one, integral curve for each pair of vahK\s of a and E 
which vanishes at x = — oo . ^ If we multiply the corresponding solution of 
the differential ecpiation, say y = by an arbitrary constant, we 

^ These statements are rigorougily proved in Appendix C. 



84 


ONE-DIMENSIONAL ENERGY-LEVEL PROBLEMS [Chap. Ill 


obtain a new solution with the same value of E which also satisfies the
boundary condition at −∞. We cannot in this way affect the behavior
of the integral curves at the positive end of the x axis, however, and must
look to a variation of the energy E for aid in picking out integral curves
which meet the boundary conditions at both +∞ and −∞.

19b. The Discrete Eigenfunctions. — Consider next the way in which
the adjustment of E affects the integral curves. Let the functions
ψ₁(x) and ψ₂(x) denote integral curves for the energies E₁ and E₂, respectively,
both of which satisfy the left-hand boundary condition. If
E₁ > E₂, the curve ψ₁(x) will oscillate more rapidly in G than ψ₂(x)
and its nodes will be closer together. Furthermore, it is easy to see
that the first node of ψ₁(x) (not counting the node at −∞) lies to the
left of the first node of ψ₂(x). Hence each node of the former function
will lie to the left of the corresponding node of the latter.¹ As E increases
continuously the nodes move to the left, new nodes appear at x = +∞
and enter the region G where they accumulate. The first appearance
of each node at x = +∞ marks a corresponding eigenvalue for E and
eigenfunction ψ.
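The node-counting argument above can be turned into a numerical "shooting" procedure. The following Python sketch is an illustration added here and is not part of the original treatment. It assumes units in which κ = 2 (that is, h/2π = μ = 1), uses the harmonic potential V = x²/2 as a stand-in test case, and replaces the boundary points x = ±∞ by the ends of a finite grid. The function names and grid parameters are hypothetical choices.

```python
import numpy as np

def integrate_from_left(E, V, x):
    """Numerov integration of psi'' = -k(x) psi, k = 2*(E - V), starting at x[0].

    The start values select the solution that vanishes at the left end of
    the grid, which is assumed to lie deep inside region F.
    """
    h = x[1] - x[0]
    k = 2.0 * (E - V(x))
    c = h * h / 12.0
    psi = np.zeros_like(x)
    psi[1] = 1e-10                      # arbitrary amplitude (overall constant)
    for i in range(1, len(x) - 1):
        psi[i + 1] = (2.0 * (1.0 - 5.0 * c * k[i]) * psi[i]
                      - (1.0 + c * k[i - 1]) * psi[i - 1]) / (1.0 + c * k[i + 1])
    return psi

def count_nodes(psi):
    """Number of sign changes of psi on the grid (the starting zero excluded)."""
    s = np.sign(psi[1:])
    return int(np.sum(s[:-1] * s[1:] < 0))

def eigenvalue(n, V, x, e_lo, e_hi, tol=1e-8):
    """Bisect on E for the energy at which the (n+1)-th node first enters the grid."""
    while e_hi - e_lo > tol:
        e_mid = 0.5 * (e_lo + e_hi)
        if count_nodes(integrate_from_left(e_mid, V, x)) > n:
            e_hi = e_mid
        else:
            e_lo = e_mid
    return 0.5 * (e_lo + e_hi)

def harmonic(x):
    """Test potential V = x^2 / 2 (an assumption made for this sketch)."""
    return 0.5 * x ** 2

grid = np.linspace(-6.0, 6.0, 1501)
levels = [eigenvalue(n, harmonic, grid, 0.0, 10.0) for n in range(3)]
print(levels)   # should lie close to n + 1/2 in these units
```

Bisection on the node count relies on the property stated in the text: as E increases, nodes only move to the left and new ones enter at the right-hand boundary, so the count never decreases.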

The minimum number of nodes between the boundaries is zero and the
eigenfunction having no nodes except those at the boundary points has
the lowest eigenvalue of E. Since a continuous curve with nodes at −∞
and +∞ must be concave to the axis somewhere, the minimum eigenvalue
must be greater than the minimum value of V(x). Thus the
eigenvalue-eigenfunction problem has no solutions for Case I.

Fig. 5. — Eigenfunctions of the anharmonic oscillator.

Clearly any two eigenfunctions having the same eigenvalue must
have the same zeros and can differ only by a multiplicative constant.²
For our present purpose they can be identified with one another. If
we number the eigenfunctions in the order of their energy values starting
with zero, the nth of these functions will have exactly n internal nodes
dividing the x axis into n + 1 parts. In this theory the number of

¹ For a more exact discussion of the displacement of the nodes as E increases see
Riemann-Weber, D.P., vol. I, p. 291.

² This is not usually true in multidimensional problems.





nodes plays a role analogous to that of the integral quantum numbers of
the Bohr theory. It is an integer which is used to identify the eigenvalues
and eigenfunctions, giving the ordinal number of the former when
arranged in a series according to magnitude.

Figure 5 shows the qualitative form of the eigenfunctions for some of 
the lower eigenvalues. 

19c. The Continuous Spectrum of Class B Eigenfunctions. — The above
discussion covers the energy range of Case II. In Case III, where E
is greater than D, the integral curves cannot be made to vanish at x = +∞.
For large values of x we have approximately (cf. p. 87)

$$\frac{d^2\psi}{dx^2} = -\alpha^2\psi, \qquad \alpha^2 = \kappa(E - D) > 0.$$

Hence ψ has the asymptotic form¹

$$\psi \sim C\cos(\alpha x - \epsilon),$$

appropriate to a free particle, and we conclude that it cannot be made
quadratically integrable for any value of E greater than D. Thus the
spectrum of discrete energy levels whose eigenfunctions satisfy the
boundary conditions A terminates at E = D. However, the asymptotic
form C cos [α(E)x − ε(E)] shows that all solutions of the wave equation
for positive values of E − D remain finite as x approaches ∞. By
adjustment of the constants of integration we can obtain a solution for
every energy which vanishes at x = −∞ and has any desired amplitude
C(E) at x = +∞. If C(E) is made a continuous function, the integral
curves derived in this way form a one-parameter family ψ(x,E) which
depends continuously on E. The integration of this family over a range
of energy values gives a function Δψ which is readily shown to vanish
at x = +∞ as 1/x, and which is therefore quadratically integrable.
Thus, using the asymptotic form we have for large values of x,

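$$\Delta\psi = \int_{E_1}^{E_2} C(E)\cos[\alpha x - \epsilon]\,dE = \int_{\alpha_1}^{\alpha_2} A(\alpha)\cos\alpha x\,d\alpha + \int_{\alpha_1}^{\alpha_2} B(\alpha)\sin\alpha x\,d\alpha,$$

where

$$A(\alpha) = \frac{dE}{d\alpha}\,C(E)\cos\epsilon, \qquad B(\alpha) = \frac{dE}{d\alpha}\,C(E)\sin\epsilon .$$

Successive integrations by parts yield an expansion in inverse powers of x:

$$\Delta\psi = \frac{1}{x}\Bigl[A(\alpha)\sin\alpha x - B(\alpha)\cos\alpha x\Bigr]_{\alpha_1}^{\alpha_2} + \frac{1}{x^2}\Bigl[\frac{dA}{d\alpha}\cos\alpha x + \frac{dB}{d\alpha}\sin\alpha x\Bigr]_{\alpha_1}^{\alpha_2} + \cdots .$$

This expansion shows that Δψ is quadratically integrable. It follows

¹ For a rigorous discussion of the asymptotic form of ψ(x) see Theorem II, Appendix H.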



that Δψ(x,E) is a class B function. (By making C(E) a differentiable
function which vanishes at E₁ and E₂, we can eliminate the terms in
1/x in the above series. In fact we can apparently make Δψ approach
zero at infinity as rapidly as any desired power of 1/x by choosing C(E)
as an analytic function which vanishes together with its first n derivatives
at E₁ and E₂. This conclusion, however, takes inadequate account
of the discrepancies between ψ(x,E) and the corresponding asymptotic
B. W. K. approximation.)
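The decay of such a superposition can be checked numerically. The Python sketch below is an added illustration under simplifying assumptions: it works directly with the variable α, sets ε = 0 so that only the cosine integral appears, and takes the limits α₁ = 1 and α₂ = 2. The two weight functions `flat` and `smooth` are hypothetical examples, and the author's caveat about the accuracy of the asymptotic form applies equally here.

```python
import numpy as np

def packet(x, weight, a1=1.0, a2=2.0, n=20001):
    """Trapezoidal estimate of the integral of weight(a) * cos(a * x) from a1 to a2."""
    a = np.linspace(a1, a2, n)
    f = weight(a) * np.cos(a * x)
    da = a[1] - a[0]
    return float(da * (np.sum(f) - 0.5 * (f[0] + f[-1])))

def flat(a):
    """Constant amplitude A(alpha) = 1, nonzero at both limits."""
    return np.ones_like(a)

def smooth(a, a1=1.0, a2=2.0):
    """Amplitude that vanishes together with its first derivative at both limits."""
    return np.sin(np.pi * (a - a1) / (a2 - a1)) ** 2

for x in (50.0, 100.0, 200.0):
    # x * packet stays bounded for the flat weight (decay as 1/x);
    # x**3 * packet stays bounded for the smooth weight (the 1/x and 1/x**2 terms drop out).
    print(x, x * packet(x, flat), x ** 3 * packet(x, smooth))
```

For the flat weight the integral can be done in closed form, [sin(α₂x) − sin(α₁x)]/x, which is the first bracket of the expansion above with B = 0.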

The continuous range of energy values for which E > D will be
referred to as the continuous energy spectrum and the corresponding
type B solutions of the wave equation will be called continuous spectrum,
or type B, eigenfunctions. This term will be applied to both the space
factor ψ and the complete wave function Ψ.

If the potential function V(x) is infinite at both extremes of the x axis
the continuous energy spectrum disappears and the discrete spectrum
has an infinite range of values. A special case of interest is that of the
ideal linear oscillator where V = ½kx². This problem is discussed in
detail in Sec. 20. The same remark holds good if V(x) has two poles
of the second or higher order at finite points x₁, x₂, and if we consider
solutions in the range x₁ < x < x₂. In this case ψ must vanish at the
singular points or become infinite there. The forms of the lower
eigenfunctions remain essentially the same independent of the behavior of
V(x) at points far from its minimum.

19d. The Paradox of the Nodes. — The existence of nodes in the
eigenfunctions of the wave equation may be disconcerting at first sight,
since it means that if we determine the positions of an assemblage of
linear oscillators all of which have a common definite energy, we shall
never find particles at the nodal points and very rarely near these points.
If we are resigned to the undulatory theory of matter, the existence of
these nodes is in itself no stranger than the existence of nodes in a standing
wave system in the electromagnetic theory of light. But we are in
apparent difficulty if we ask how it happens that we sometimes find a
particle on one side of a node and sometimes on the other, never right
at the node itself. For if we think of the particles as vibrating back
and forth, we expect that unless they attain an infinite velocity in passing
through the nodes, they must spend some time in the neighborhood of
each.

The paradox originates in an unwarranted though inevitable carry-over
of classical conceptions into the new mechanics. There is no way
in which the motion of a particle having a definite energy can be followed
experimentally, and hence we must infer that the classical conception
of a definite space-time orbit is inapplicable. It is best to say that the
particle has no definite position until its position is measured (cf. p. 76).
When that is done the particle is left in a state whose wave function





forms a very compact wave packet and is no longer in the original
energy level. We may perhaps think of position, momentum, and energy
as properties which the particle assumes under the stress of appropriate
experiments. Since it is not possible to measure position and energy
exactly at the same time, we shall not assume the existence of either
one unless ψ has a special form which defines the quantity uniquely.
This is about the same thing as saying that position and energy exist
only when there is no uncertainty regarding their values.¹ In any
case it should be clear that the paradox of the nodes involves no question
of disagreement between theory and experiment but rather a disagreement
between theory and metaphysical preconceptions. We return
to these philosophical questions in Sec. *36b and Chap. IX.

20. THE PLANCK IDEAL LINEAR OSCILLATOR 

20a. The Sommerfeld Polynomial Method. — The simplest example
of an energy-level problem whose exact solution has been worked out
is that of the Planck ideal linear oscillator with the potential-energy
function V = ½kx². The corresponding wave equation and boundary-value
problem were well known to mathematicians before the advent of
quantum mechanics.²

To solve Eq. (18-2) for this special case we shall employ the polynomial
method of Sommerfeld³ which is based on the fact that in many
cases each eigenfunction can be factored into the product of a polynomial
P(x) and a transcendental or irrational algebraic function Q(x). The
roots of the polynomial give the nodes of the function, and its degree
is therefore equal to the number of nodes. The factor Q(x) takes care
of the behavior of the function at the singular points which bound the
region in which the differential equation is defined.

For example, in the case under discussion the points x = ±∞, where
V becomes infinite, are singular points. To determine the appropriate
form for Q(x) we first seek an approximate solution of the differential
equation valid only for large values of x². The equation is conveniently
written in the form

$$\frac{d^2\psi}{dx^2} + (\beta - \alpha^2x^2)\psi = 0 \tag{20-1}$$

¹ To say that the matter particles do not have definite positions except under these
special conditions is really giving up our initial naive hypothesis that matter ultimately
consists of corpuscles. It would hardly be legitimate to stretch the definition
of a corpuscle to include an entity that had a definite position only a part of the time,
and an infinitesimal part of the time at that. We substitute the postulate that matter
is an entity which exhibits the characteristics of corpuscles whenever it is subjected
to observations designed to measure its position.

² E. Schrödinger, Ann. d. Physik (4) 79, 489 (1926); Courant-Hilbert, p. 283.

³ A. Sommerfeld, Atombau und Spektrallinien, Wellenmechanischer Ergänzungsband,
p. 2, Braunschweig, 1929.





with the notation

$$\beta = \kappa E, \qquad \alpha^2 = \frac{\kappa k}{2} = \frac{4\pi^2\mu k}{h^2}.$$

For large values of x² the term in β may be neglected and we can immediately
write down the approximate solution

$$\psi = Q \cong e^{\pm\alpha x^2/2}.$$

In order to satisfy the boundary condition that ψ(±∞) = 0, we must
adopt the negative exponent. We next assume for the rigorous solution
the form

$$\psi = v\,e^{-\alpha x^2/2}. \tag{20-2}$$

Substitution into Eq. (20-1) yields the differential equation

$$\frac{d^2v}{dx^2} - 2\alpha x\frac{dv}{dx} + (\beta - \alpha)v = 0 \tag{20-3}$$

for v. This equation can be solved by means of a power series in x. For
convenience we first make the substitution

$$\xi = \sqrt{\alpha}\,x,$$

which yields

$$\frac{d^2v}{d\xi^2} - 2\xi\frac{dv}{d\xi} + \Bigl(\frac{\beta}{\alpha} - 1\Bigr)v = 0. \tag{20-4}$$

Introducing the series expansion $v = \sum_r a_r\xi^r$ and equating the coefficient
of each power of ξ to zero we obtain the recurrence formula

$$(r + 2)(r + 1)\,a_{r+2} - \Bigl(2r + 1 - \frac{\beta}{\alpha}\Bigr)a_r = 0.$$

This formula shows that a₀ determines the coefficients of all the even
powers of ξ while a₁ determines the coefficients of all the odd powers.
a₀ and a₁ themselves are arbitrary and may be regarded as constants of
integration. If we set a₁ equal to zero we shall get a solution involving
only even powers of x, and therefore as a whole an even function of x.
If we set a₀ equal to zero, we get a solution which is an odd function of x.
Since V is an even function of x and the boundary conditions are the
same at the two ends of the x axis, it is evident that each of the
eigenfunctions must be either an even or an odd function of x.¹ Hence we
must expect that in every case either a₀ or a₁ will be zero.

¹ To give a rigorous proof, one may proceed as follows. Let ψ = f(x) be a solution
of Eq. (20-1) which satisfies the boundary conditions. Make the change of variables
x' = −x. Then Eq. (20-1) becomes





20b. Determination of the Eigenvalues. — For very large values of
r we have approximately

$$\frac{a_{r+2}}{a_r} \cong \frac{2}{r + 2}.$$

This ratio occurs also in the series expansion of e^{ξ²}, so that in general
both the even and odd series become infinite as e^{ξ²} when ξ becomes
positively or negatively infinite. It follows that despite the exponential
decrease of Q with increasing values of ξ², the complete function ψ usually
becomes infinite as e^{ξ²/2} when ξ becomes infinite. Exceptions occur for
those special values of the energy which cause one series or the other to
break off after a finite number of terms. Suppose, for example, that

$$\frac{\beta}{\alpha} = 2n + 1, \tag{20-5}$$

where n is any positive integer or zero. Then the recurrence formula shows that
a_{n+2} is zero. If n is even, the series breaks off with the term a_n ξⁿ and
we get a solution by setting a₁ equal to zero. Similarly we get a solution
for odd values of n by setting a₀ equal to zero. Inserting the values of
α and β into Eq. (20-5) we obtain the eigenvalues of E,

$$E_n = (n + \tfrac{1}{2})h\nu_c, \tag{20-6}$$

where ν_c is the classical vibration frequency (2π)⁻¹√(k/μ). This is the
same as the Bohr energy-level formula with half-integral quantum
numbers.
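The way the series breaks off can be followed explicitly. The Python sketch below is an added illustration, not part of the original text. It builds the coefficients from the recurrence formula with β/α = 2n + 1 and compares the resulting polynomial with the Hermite polynomial of the same degree. The helper name `series_coefficients` is hypothetical, and the choice of unity for the leading coefficient a₀ or a₁ is an arbitrary normalization.

```python
import numpy as np
from numpy.polynomial.hermite import herm2poly

def series_coefficients(n):
    """Power-series coefficients a_0 .. a_n of v_n(xi) for beta/alpha = 2n + 1.

    Uses (r+2)(r+1) a_{r+2} = (2r + 1 - beta/alpha) a_r, starting the series
    whose parity matches n with coefficient 1 and setting the other to zero.
    """
    ratio = 2 * n + 1                      # beta / alpha from Eq. (20-5)
    a = np.zeros(n + 1)
    a[n % 2] = 1.0
    for r in range(n % 2, n - 1, 2):
        a[r + 2] = (2 * r + 1 - ratio) * a[r] / ((r + 2) * (r + 1))
    return a

for n in range(5):
    a = series_coefficients(n)
    H = herm2poly([0] * n + [1])           # power-series coefficients of H_n
    # The two polynomials should agree up to the constant factor H[n % 2].
    print(n, a, np.allclose(a * H[n % 2], H))
```

Because the factor (2r + 1 − β/α) vanishes at r = n, the coefficient a_{n+2} and all later ones of the same parity are zero, which is the termination condition used in the text.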

20c. The Eigenfunctions and Their Properties. — Combining the
recurrence formula with Eq. (20-5) we obtain the following explicit
expressions for the coefficients in the polynomials v_n(ξ):

$$\frac{a_r}{a_0} = (-2)^{r/2}\,\frac{n(n-2)\cdots(n-r+2)}{r!} \qquad (r \text{ even}),$$

$$\frac{a_r}{a_1} = (-2)^{(r-1)/2}\,\frac{(n-1)(n-3)\cdots(n-r+2)}{r!} \qquad (r \text{ odd}).$$

The v_n's so defined are identical to a constant factor with the well-known

$$\frac{d^2\psi}{dx'^2} + (\beta - \alpha^2x'^2)\psi = 0.$$

This is identical in form with the original equation, showing that f(−x) is, like f(x),
a solution of (20-1). As f(−x) satisfies the boundary conditions, it is an eigenfunction
with the same eigenvalue as f(x). Hence it can differ from f(x) at most by a constant
factor. By the normalization condition this factor is ±1. f(x) is therefore either
an even or an odd function of x.





Hermitian polynomials H_n(ξ) which may also be written in the form

$$H_n(\xi) = (-1)^n e^{\xi^2}\frac{d^n}{d\xi^n}e^{-\xi^2}. \tag{20-7}$$

The eigenfunctions are in turn identical with the Hermitian orthogonal
functions. The properties of these functions are quite fully
described in various mathematical texts.¹ The most important are the
following:

$$H_{n+1} - 2\xi H_n + 2nH_{n-1} = 0; \tag{20-8}$$

$$\frac{dH_n}{d\xi} = 2nH_{n-1}(\xi); \tag{20-9}$$

$$\int_{-\infty}^{\infty} H_m(\xi)H_n(\xi)\,e^{-\xi^2}\,d\xi = 2^n n!\sqrt{\pi}\,\delta_{mn}. \tag{20-10}$$

In the last equation the Kronecker symbol δ_mn denotes that function of
the integers m and n which is unity when m = n and is zero when m ≠ n.
Equation (20-10) is a statement of the quadratic integrability of the
eigenfunctions and can be used to normalize them as required by Eq. (8-9).
We write

$$\psi_n = c_n H_n(\xi)\,e^{-\xi^2/2},$$

where c_n is a complex constant containing a₀ or a₁, as the case may be.
Then c_n is to have an absolute value such that

$$\int_{-\infty}^{\infty}|\psi_n|^2\,dx = \frac{|c_n|^2}{\sqrt{\alpha}}\int_{-\infty}^{\infty}H_n^2\,e^{-\xi^2}\,d\xi = \frac{|c_n|^2\,2^n n!\sqrt{\pi}}{\sqrt{\alpha}} = 1.$$

Solving for c_n and expressing α in terms of the vibrational frequency ν_c
(α = 4π²μν_c/h) we obtain the expression

$$\psi_n = e^{i\chi_n}\Bigl(\frac{4\pi\mu\nu_c}{h}\Bigr)^{1/4}(2^n n!)^{-1/2}\,H_n(\xi)\,e^{-\xi^2/2}, \qquad \xi = 2\pi x\sqrt{\frac{\mu\nu_c}{h}}, \tag{20-11}$$

for the normalized eigenfunctions of the problem. χ_n is an arbitrary
phase constant usually set equal to zero.
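The orthogonality relation (20-10), on which the normalization rests, is easy to verify numerically. The following Python sketch is an added illustration that uses Gauss-Hermite quadrature from NumPy. The function names and the number of quadrature points are hypothetical choices.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

def hermite_overlap(m, n, points=40):
    """Gauss-Hermite estimate of the integral of H_m(xi) H_n(xi) exp(-xi^2)."""
    xi, weights = hermgauss(points)        # nodes and weights for the weight exp(-xi^2)
    h_m = hermval(xi, [0] * m + [1])       # H_m evaluated at the nodes
    h_n = hermval(xi, [0] * n + [1])       # H_n evaluated at the nodes
    return float(np.sum(weights * h_m * h_n))

def norm_constant(n):
    """Right-hand member of Eq. (20-10) for m = n."""
    return 2.0 ** n * math.factorial(n) * math.sqrt(math.pi)

print(hermite_overlap(3, 3), norm_constant(3))   # the two numbers should agree
print(hermite_overlap(2, 4))                     # should vanish to rounding error
```

With 40 quadrature points the rule is exact, apart from rounding, for products of Hermite polynomials whose combined degree does not exceed 79.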

21. AN APPROXIMATION METHOD WHICH CORRELATES THE EIGENVALUES
OF WAVE MECHANICS WITH THE ENERGY LEVELS OF THE BOHR THEORY

21a. The B. W. K. Approximations for ψ(x). — The possibility of finding
the experimental energy levels of atoms from wave theory was first
indicated by de Broglie² who showed that the Bohr quantum condition

¹ Courant-Hilbert, M.M.P., pp. 77-79, 283; Riemann-Weber, D.P., vol. I, pp.
342-347.

² L. de Broglie, Thesis, Chap. III; J. de Physique 7, 327 (1926). Cf., however,
M. Brillouin, Comptes Rendus 168, 1318 (1919), 169, 48 (1919), 171, 1000 (1920);
J. de Physique 8, 65 (1922).





for circular orbits in hydrogen is identical with the condition that the
"optical path length" around the orbit is an integral number of wave
lengths. Later Schrödinger¹ showed that in a number of important
special cases the energy values given by appropriate solutions of his
first wave equation are in substantial agreement with the Bohr theory
and with experiment. It remained for Brillouin, Wentzel, and Kramers,²
however, to show why it is that the eigenvalue theory of wave mechanics
is in such systematic agreement with the energy-level predictions of the
Bohr quantum theory and to give an explanation of such discrepancies
as do occur. They have developed a powerful method of attack on
eigenvalue problems in which the Sommerfeld phase-integral formulas
of the Bohr theory drop out in first approximation. The method is
based on a simple idea and is relatively easy to apply, although a rigorous
justification of the procedure (cf. Secs. 21c, d, e, and f) leads to rather
tedious calculations. It is primarily adapted to problems which can be
thrown into one-dimensional form by separation of the variables. Even
when this cannot be done rigorously, it is frequently possible to introduce
approximations which make such separation possible. We accordingly
restrict the discussion here to the one-dimensional case.

Let us seek a fundamental pair of solutions of the Schrödinger equation
(18-2) having the form³

$$\psi = A\,e^{\frac{2\pi i}{h}\int y\,dx}, \tag{21-1}$$

where A is a constant. Direct substitution shows that y must satisfy
the first-order Riccati equation

$$\frac{h}{2\pi i}\,y' + y^2 = 2\mu(E - V), \tag{21-2}$$

in which y' denotes the derivative dy/dx. Conversely every solution of
(21-2) yields a corresponding solution of the wave equation (18-2).

In order to solve (21-2) Brillouin and Wentzel independently suggested
a development in power series in h. As quantum mechanics passes over
into classical mechanics when h is set equal to zero, it is to be expected
that early approximations of this type will prove most useful in the realm
of high quantum numbers where classical theory is most nearly correct.
Unfortunately an expansion of this kind is only semiconvergent and
cannot yield an exact solution of the problem.⁴ On this account we

¹ E. Schrödinger, Ann. d. Physik (4) 79, 361, 489 (1926).

² L. Brillouin, Comptes Rendus 183, 24 (1926), J. de Physique 7, 353 (1926); G.
Wentzel, Zeits. f. Physik 38, 518 (1926); H. A. Kramers, Zeits. f. Physik 39, 828
(1926). The mathematical method of Brillouin, Wentzel, and Kramers seems to
have been first used by H. Jeffreys, Proc. London Math. Soc. (2) 23, 428 (1923).

³ Cf. the substitution (11-2) leading to the Hamilton-Jacobi equation (11-4).

⁴ Cf. A. Zwaan, Arch. Néerland., ser. IIIA, tome XII, 1 (1929); G. D. Birkhoff,
Bull. Am. Math. Soc. 39, 696 (1933); R. E. Langer, ibid. 40, 545 (1934).





shall limit the discussion here to the approximation obtained by
using the first two terms of an expansion of y in powers of h. What
we lose thereby in generality will be offset by the fact that we can write
down an explicit formula for the approximation under consideration.¹

Inserting the series $y = \sum_{k=0}^{\infty}\bigl(\tfrac{h}{2\pi i}\bigr)^k y_k$ into Eq. (21-2) and equating
the coefficients of the two lowest powers of h/2πi to zero we obtain the
following formulas for the first two terms of the expansion:

$$y_0 = \pm[2\mu(E - V)]^{1/2} = \pm p(x,E), \tag{21-3}$$

$$y_1 = -\frac{y_0'}{2y_0} = \frac{V'}{4(E - V)}. \tag{21-4}$$

The two different signs of y₀ yield the following linearly independent
approximate solutions of Eq. (18-2):

$$f_u(x,E) = p(x,E)^{-1/2}\,e^{iw(x,E)}, \qquad f_v(x,E) = p(x,E)^{-1/2}\,e^{-iw(x,E)}, \tag{21-5}$$

$$w(x,E) = \frac{2\pi}{h}\int_{x'}^{x} p(\xi,E)\,d\xi. \tag{21-6}$$

These functions are usually referred to as B. W. K. or W. B. K. approximations.
The exponents are here written as definite integrals to avoid
ambiguity regarding the constants of integration. We assume that the
potential energy V(x) has the form indicated in Fig. 4 of Sec. 18, and
identify the lower limit of integration in (21-6) with the point x'(E)
which forms the left-hand boundary of the region of classical motion G
for the energy in question (cf. Fig. 4). In the region F to the left of x'
the classical local momentum p(x,E) is imaginary, whereas in G it is real.
At x' itself, p vanishes.
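The passage from the two-term series to the approximate solutions is not written out in the text. A short sketch of the intermediate steps, in the notation of this section, may be helpful. It assumes only the Riccati equation (h/2πi)y' + y² = 2μ(E − V) and the definition p = [2μ(E − V)]^{1/2}.

```latex
% Insert y = y_0 + (h/2\pi i)\,y_1 + \cdots into (h/2\pi i)\,y' + y^2 = 2\mu(E - V)
% and collect equal powers of h/2\pi i:
\begin{align*}
\text{order } (h/2\pi i)^0:&\quad y_0^2 = 2\mu(E - V) = p^2,\\
\text{order } (h/2\pi i)^1:&\quad y_0' + 2y_0y_1 = 0
   \;\Longrightarrow\; y_1 = -\frac{y_0'}{2y_0} = -\frac{1}{2}\frac{d}{dx}\ln y_0 .
\end{align*}
% Since p^2 = 2\mu(E - V), we have p'/p = -V'/[2(E - V)], so y_1 = V'/[4(E - V)].
% Keep two terms and integrate the exponent of \psi = A\exp[(2\pi i/h)\int y\,dx]:
\begin{align*}
\frac{2\pi i}{h}\int\Bigl(y_0 + \frac{h}{2\pi i}\,y_1\Bigr)dx
  &= \pm\frac{2\pi i}{h}\int p\,dx - \tfrac{1}{2}\ln p + \text{const},\\
\psi &\approx A\,p^{-1/2}\exp\Bigl(\pm\frac{2\pi i}{h}\int_{x'}^{x}p\,d\xi\Bigr)
      = A\,p^{-1/2}\,e^{\pm iw(x,E)} .
\end{align*}
```

The factor p^{-1/2} thus comes entirely from the second term y₁ of the series, while the phase w comes from the first term y₀.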

In G the approximations f_u and f_v, when multiplied by the time
factor e^{−2πiEt/h}, yield progressive waves traveling to the right and left,
respectively. If γ is an arbitrary real number, the combination

$$f = e^{i\gamma}f_u + e^{-i\gamma}f_v = 2p^{-1/2}\cos\Bigl(\frac{2\pi}{h}\int_{x'}^{x}p\,dx + \gamma\Bigr) \tag{21-7}$$

represents in the same way a real oscillatory standing wave qualitatively
similar in its behavior to the real integral curves ψ(x) of Eq. (18-2).

In F the exponential factors of f_u and f_v are both real² and we have
only to multiply the approximation functions by (−1)^{1/4} = e^{iπ/4}

¹ This approximation is the one-dimensional form of that described by Eqs.
(11-2), (11-5), and (11-7).

² In order to avoid ambiguity due to the multiple-valued character of p(x,E) and
p(x,E)^{1/2}, a sign convention is needed. We shall have occasion to make use of complex



in order to convert them into real functions. We assume that

$$\lim_{x\to-\infty} i\,w(x,E) = \frac{2\pi}{h}\lim_{x\to-\infty}\int_{x}^{x'}|2\mu(V - E)|^{1/2}\,d\xi = \infty.$$

It follows from the sign convention of the footnote that $\lim_{x\to+\infty} iw = \infty$.
Then, clearly, f_u becomes infinite at x = ±∞, whereas $\lim_{x\to\pm\infty} f_v(x) = 0$.

At the point x' itself, both f_u and f_v become infinite and in the
neighborhood of this point they have no value as approximations to ψ.
This same remark applies equally to the neighborhood of the point x"
which bounds G on the right.

21b. Application to Eigenvalue Problem. — In order to discuss the
eigenvalue-eigenfunction problem with the aid of the above approximations
it is necessary that we shall be able to determine what particular
linear combination of f_u, f_v will fit an exact wave function ψ(x,E) in the
regions G and H when the linear combination appropriate to F is known.
As the approximations blow up near x' and x", these neighborhoods form
gaps to be bridged. When appropriate connection formulas are established,
it will be possible to use f_u and f_v to evaluate the approximate
energies at which the boundary conditions at x = ±∞ can be fulfilled.

Let us consider first the problem of bridging the gap at x = x'.
We assume that for large negative values of x the linear combination
α_u f_u + α_v f_v describes a real integral curve ψ(x,E) to a satisfactory degree
of approximation. We desire to determine a corresponding linear

values of the argument x and hence require a branch of each function which is
single-valued over the entire complex plane. The function V(x) may be assumed to be
analytic over the entire plane. p(x,E) will then be analytic over a Riemann surface
of two sheets with branch points at x', x", and at every other simple zero of V(x) − E.
If the Riemann surface is cut along the axis of reals from x' through x" to +∞ and
from every other branch point along a radial line leading to infinity, it will separate
into two sheets over each one of which p(x,E) is single-valued. The functions
f_u, f_v have the same branch points as p. Hence similar cuts in the corresponding
Riemann surfaces yield single-valued branches of all the functions in which we are
interested.

Let us now pick out that branch of p, say p₁, which is positive imaginary along
the uncut axis of reals in F, and that branch of p^{1/2}, say (p^{1/2})₁, which makes
e^{−iπ/4}p^{1/2} a positive real in F. We identify w(x,E), f_u, and f_v uniquely by
postulating that in formulas (21-5) and (21-6) p and p^{1/2} are to be interpreted as
p₁ and (p^{1/2})₁, the path of integration never being permitted to cross any of the cuts.

When applying these formulas to points on the axis of reals in G and H we shall
always use those values appropriate to the upper lip of the cut unless the contrary is
explicitly stated.

It follows from these conventions that on the upper lip of the cut in G, p and p^{1/2}
are positive real. On the upper lip in H, p is negative imaginary and e^{iπ/4}p^{1/2} is a
positive real.





combination β_u f_u + β_v f_v which shall give a satisfactory description of
the same integral curve ψ(x,E) in that portion of G which is remote
from x' and x".

This problem was attacked and partially solved by Jeffreys (and
independently by Kramers) using a linear approximation, say

$$E - k(x - x'),$$

for V(x) in the neighborhood of the point x'. Equation (18-2) is thereby
reduced to a form which can be integrated exactly, yielding Bessel's
functions of order ±1/3 for ψ(x). If the true potential function does not
differ from the approximate one in the region where f_u and f_v are bad,
the exact solutions for the modified potential can be used to bridge the
gap and to give a description of ψ in the neighborhood of x'.

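In this way it is possible to derive the Kramers connection formula

$$\underbrace{|p|^{-1/2}\exp\Bigl[-\frac{2\pi}{h}\int_{x}^{x'}|p|\,d\xi\Bigr]}_{\text{Region } F} \;\longrightarrow\; \underbrace{2p^{-1/2}\cos\Bigl[\frac{2\pi}{h}\int_{x'}^{x}p\,d\xi - \frac{\pi}{4}\Bigr]}_{\text{Region } G} \tag{21-8}$$

A proof of the relations (21-8) and (21-9) will be given in Sec. 21e.
The sign → indicates that if the left-hand member is fitted to ψ in F, the
right-hand member will be fitted to ψ in G, but that the reverse is not
necessarily true. A similar connection formula

$$\underbrace{|p|^{-1/2}\exp\Bigl[-\frac{2\pi}{h}\int_{x''}^{x}|p|\,d\xi\Bigr]}_{\text{Region } H} \;\longrightarrow\; \underbrace{2p^{-1/2}\cos\Bigl[\frac{2\pi}{h}\int_{x}^{x''}p\,d\xi - \frac{\pi}{4}\Bigr]}_{\text{Region } G} \tag{21-9}$$

applies at the right-hand boundary. Putting these together we see that
if ψ(x) is an eigenfunction of the differential equation (18-2), i.e., if it is a
solution which vanishes exponentially at the boundary points, the
approximate expressions for ψ(x) in the good portion of the region G given
by (21-8) and (21-9) should be equal. Accordingly, writing the cosine
of (21-9) with the limits of integration interchanged (the cosine being an
even function),

$$2p^{-1/2}\cos\Bigl[\frac{2\pi}{h}\int_{x'}^{x}p\,d\xi - \frac{\pi}{4}\Bigr] = 2C\,p^{-1/2}\cos\Bigl[\frac{2\pi}{h}\int_{x''}^{x}p\,d\xi + \frac{\pi}{4}\Bigr].$$

Here C is a real constant which may be either positive or negative.
The equation can be valid for a finite interval in G only if the arguments
of the two cosine functions differ by an integral multiple of π.
Consequently the condition for an eigenvalue is that

$$\frac{2\pi}{h}\int_{x'}^{x''}p\,d\xi - \frac{\pi}{2} = n\pi, \qquad n = 0, 1, 2, \cdots$$

or¹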

¹ Equation (21-10) is equivalent to the equation

$$\oint\Bigl(y_0 + \frac{h}{2\pi i}\,y_1\Bigr)dx = nh, \qquad n = 1, 2, 3, \cdots$$

where y₀ and y₁ are defined by (21-3) and (21-4).





$$J(E) = \oint p(E,\xi)\,d\xi \equiv 2\int_{x'}^{x''}p(E,\xi)\,d\xi = (n + \tfrac{1}{2})h, \qquad n = 0, 1, 2, \cdots \tag{21-10}$$

The left-hand member is the familiar Sommerfeld phase integral taken
over a complete classical oscillation and Eq. (21-10) is therefore the
Bohr theory quantum condition with "half-integral" vibrational quantum
numbers. In this approximation the wave theory and the Bohr theory
have the same eigenvalues.

In applying the above formula one evaluates the integral J(E) with
the aid of the explicit expression for V(x) appropriate to the problem in
hand and then solves (21-10) for E in terms of n.
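The procedure just described can be carried out numerically when J(E) has no convenient closed form. The Python sketch below is an added illustration under stated assumptions: units with h/2π = μ = 1 (so that h = 2π), an even potential that increases with |x| so that the turning points are ±a, and hypothetical function names throughout.

```python
import math
import numpy as np

H_PLANCK = 2.0 * math.pi      # assumption: units with hbar = mu = 1, so h = 2*pi

def turning_point(E, V, x_max=50.0, tol=1e-12):
    """Positive root of V(x) = E for an even potential increasing with |x|."""
    lo, hi = 0.0, x_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if V(mid) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phase_integral(E, V, x_left, x_right, n=4000):
    """J(E) = 2 * integral of sqrt(2 (E - V)) between the turning points.

    The substitution x = c + r cos(theta) with the midpoint rule in theta
    tames the square-root behavior of p at the turning points.
    """
    c, r = 0.5 * (x_left + x_right), 0.5 * (x_right - x_left)
    theta = (np.arange(n) + 0.5) * math.pi / n
    x = c + r * np.cos(theta)
    p = np.sqrt(np.maximum(2.0 * (E - V(x)), 0.0))
    return float(2.0 * np.sum(p * r * np.sin(theta)) * math.pi / n)

def wkb_level(n, V, e_lo=1e-6, e_hi=100.0, tol=1e-10):
    """Solve J(E) = (n + 1/2) h for E by bisection (J increases with E)."""
    target = (n + 0.5) * H_PLANCK
    while e_hi - e_lo > tol:
        e_mid = 0.5 * (e_lo + e_hi)
        a = turning_point(e_mid, V)
        if phase_integral(e_mid, V, -a, a) < target:
            e_lo = e_mid
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)

def harmonic(x):
    """Test potential V = x^2 / 2, for which J(E) = 2 pi E in these units."""
    return 0.5 * x ** 2

print([wkb_level(n, harmonic) for n in range(4)])   # should be close to n + 1/2
```

For the harmonic oscillator the phase-integral condition reproduces the exact levels of Eq. (20-6). For other potentials the values obtained this way are approximations only, as the discussion of errors later in this section makes clear.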

Convenient and simple as the formulas (21-8), (21-9), and (21-10) are,
it is necessary to use them with care to avoid mistakes. This means
that a careful consideration of their derivation and of the errors involved
is important. To this task we now address ourselves, using, however,
in place of the Jeffreys-Kramers method a scheme of attack first applied
to this problem by Zwaan¹ and more fully developed by the author.²

*21c. Zwaan's Method and the Stokes Phenomenon. — Zwaan
establishes a connection between the regions to the left and right of x'
by introducing complex values of the independent variable and passing
around x' in the complex x plane. Any linear combination of f_u and f_v
conforms to the differential equation

$$f'' + \Bigl(\frac{4\pi^2p^2}{h^2} - Q\Bigr)f = 0, \tag{21-11}$$

where

$$Q = \frac{3}{4}\Bigl(\frac{p'}{p}\Bigr)^2 - \frac{1}{2}\,\frac{p''}{p}. \tag{21-12}$$
¹ A. Zwaan, loc. cit., and Thesis, "Intensitäten im Ca-Funkenspektrum," Utrecht,
1929. The discussion by Zwaan (a pupil of Kramers) is an adaptation to the problem
under consideration of the previous work of Sir George Stokes. Cf. G. Stokes,
Math. and Phys. Papers, Vol. IV, pp. 77-109, 283-298.

Still another powerful procedure is due to Langer. It is based on the introduction
of modified approximate solutions of (18-2) having the form

$$\Bigl(\frac{w}{p}\Bigr)^{1/2}J_{\pm 1/3}(w),$$

in which J_{±1/3}(w) denotes a Bessel's function of order ±1/3 in the argument w(x,E)
defined by Eq. (21-6). This type of approximation degenerates into a B.W.K.-type
function for large values of w, but remains finite when w vanishes. Cf. R. E. Langer,
Trans. Am. Math. Soc. 33, 23 (1931), 34, 447 (1932); Bull. Am. Math. Soc. 40, 545
(1934); Phys. Rev. 51, 669 (1937).

² E. C. Kemble, Phys. Rev. 48, 549 (1935).





Comparing this with Eq. (18-2) which can be rewritten in the form

$$\psi'' + \frac{4\pi^2p^2}{h^2}\,\psi = 0, \tag{21-13}$$

we see that if Q is very small in comparison with 4π²p²/h² the equations
are practically the same. Thus f is to be regarded as a good approximate
solution of (21-13) in the neighborhood of a point x if

$$\Bigl|\frac{h^2}{4\pi^2}\,\frac{Q(x)}{p(x)^2}\Bigr| \ll 1.$$

If it is possible to find a path Γ in, say, the upper half of the complex
plane, joining real points to the left and right of x', crossing none of the
cuts mentioned in footnote 2, p. 92, and having the property that

$$\Bigl|\frac{h^2}{4\pi^2}\,\frac{Q(x)}{p(x)^2}\Bigr| \ll 1$$

along its entire length, we might be led to expect that the approximation f
would cling to the corresponding exact function ψ along the entire path.
On this hypothesis the coefficients β_u, β_v would be equal to α_u and α_v,
respectively. If E − V has a large enough maximum at the potential-energy
minimum¹ and if the function V(x) is smooth enough in the
neighborhood of x = x', a path Γ satisfying the above inequality can
be found. On the other hand, the inference that f must therefore cling
to ψ and that β_u and β_v must be equal to α_u and α_v, respectively, is
certainly false, for it follows from the assumed reality of ψ(x) on the real
axis that β_u = β_v*, whereas α_u e^{−iπ/4} and α_v e^{−iπ/4} are real. Thus the reality
condition is incompatible with the assumption that the same linear
combination of f_u and f_v can represent a real integral curve on both
sides of x'. The difficulty is of course connected with the fact that
ψ(x) is analytic at x', while f_u and f_v have branch points there.

We conclude that if f(ξ,x) denotes that linear combination of f_u(x)
and f_v(x) which fits ψ(x) best at the point ξ, the coefficients a_u, a_v in the
identity

$$f(\xi,x) = a_u(\xi)f_u(x) + a_v(\xi)f_v(x) \tag{21-14}$$

must change as ξ moves around the curve Γ.² The essential feature of
the situation to which Zwaan calls attention is the fact that along a

¹ In the special case where V(x) = ½kx² (harmonic oscillator), the value of
h²Q/4π²p² at the minimum point x = 0 is h²ν²/8E², where ν is the classical vibration
frequency. The value of h²Q/4π²p² at the mid-point of G for the fifth energy level
[E = (4 + ½)hν] works out to be 1/162.

² This phenomenon is characteristic of approximations based on asymptotic
series and is called the Stokes phenomenon. Cf. literature cited in footnote 4,
p. 91.




portion of the path Γ the function f_u becomes very small, while f_v becomes
very large. Hence f_v and f_u are said to be dominant and subdominant,
respectively, in the region in question.¹ Thus there is no real
contradiction between the statement that a fixed linear combination of f_u
and f_v is a good approximate solution of (21-13) in the neighborhood
in question, and the paradoxical requirement that if a_u f_u + a_v f_v is to be
an exact solution of (21-13) the coefficient a_u must change markedly
in the neighborhood in question.
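The size of the correction term for the harmonic oscillator, quoted in the footnote to the inequality above, can be checked numerically. The Python sketch below is an added illustration. It assumes that Q has the form (3/4)(p'/p)² − (1/2)p''/p, which is what one obtains by substituting p^{-1/2}e^{±iw} into a second-order equation. This explicit form is a reconstruction and should be compared with Eq. (21-12). Units with h/2π = μ = k = 1 are also assumed, so that hν = 1.

```python
import math

HBAR = 1.0     # assumption: units with hbar = mu = k = 1, so h * nu = 1
MU = 1.0
K = 1.0

def momentum(x, E):
    """Classical local momentum p = [2 mu (E - V)]^(1/2) for V = k x^2 / 2."""
    return math.sqrt(2.0 * MU * (E - 0.5 * K * x * x))

def q_over_p2(x, E, dx=1e-3):
    """hbar^2 Q / p^2 with Q = (3/4)(p'/p)^2 - (1/2) p''/p, by central differences."""
    p0 = momentum(x, E)
    p_plus, p_minus = momentum(x + dx, E), momentum(x - dx, E)
    dp = (p_plus - p_minus) / (2.0 * dx)
    d2p = (p_plus - 2.0 * p0 + p_minus) / (dx * dx)
    q = 0.75 * (dp / p0) ** 2 - 0.5 * d2p / p0
    return HBAR ** 2 * q / p0 ** 2

# Fifth level, E = (4 + 1/2) h nu, evaluated at the mid-point x = 0 of region G.
print(q_over_p2(0.0, 4.5), 1.0 / 162.0)
```

The small value found at the middle of G shows why the approximations are good there, while the growth of Q/p² toward the turning points, where p vanishes, shows why they fail in those neighborhoods.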

Zwaan assumes that the coefficient of the subdominant term, i.e.,
a_u, alone can change on Γ. Then we must have β_v = α_v and the reality
condition fixes β_u as equal to α_v*. This result leads at once to the
connection formula (21-8) as shown in Sec. 21e below.

*21d. Analysis of the Stokes Phenomenon. Zwaan's argument is
fundamentally sound, but it is not perfectly rigorous and gives little
idea of the possible errors involved in the resulting formulas. Hence we
shall here undertake a somewhat more detailed analysis of the situation.

The first step in such an analysis must be the laying down of unique
definitions of $a_u$ and $a_v$. We assume that the linear combination

$$a_u f_u + a_v f_v$$

is to be fitted to an exact solution of (21-13), say $\psi$. The fit is to be
rigorous except for the isolated singularities of $f_u$ and $f_v$. Then

$$a_u(x)f_u(x) + a_v(x)f_v(x) = \psi(x). \tag{21-15}$$

This equation is insufficient, however, to fix the values of both coefficients
and must be supplemented by another. As a second relation we impose
the requirement that

$$a_u(x)f_u'(x) + a_v(x)f_v'(x) = \psi'(x). \tag{21-16}$$

Equations (21-15) and (21-16) are evidently the equations we should
have to employ if we were fitting a fixed linear combination to $\psi$ at the
point $x$ and wished the combination to cling to $\psi$ as closely as possible
in the neighborhood of $x$. The functions $a_u$ and $a_v$ thus defined reduce
to $\alpha_u$ and $\alpha_v$, respectively, at one end of the path $\Gamma$ and to $\beta_u$ and $\beta_v$,
respectively, at the other. As the so-called Wronskian determinant
$f_u f_v' - f_v f_u'$ has the constant value $-4\pi i/h$, it is always possible to solve
Eqs. (21-15) and (21-16) for $a_u$ and $a_v$. We obtain

$$a_u = \frac{ih}{4\pi}\left(\psi f_v' - \psi' f_v\right), \qquad a_v = -\frac{ih}{4\pi}\left(\psi f_u' - \psi' f_u\right). \tag{21-17}$$
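The pointwise fit defined by (21-15) and (21-16) is just a two-by-two linear solve, and a small numerical sketch may make that concrete. The function name `fit_coefficients` and the plane-wave test case (constant $p$, so that $f_u = e^{ikx}$ and $f_v = e^{-ikx}$) are illustrative assumptions, not part of the text.

```python
import cmath


def fit_coefficients(psi, dpsi, fu, dfu, fv, dfv):
    """Solve a_u*fu + a_v*fv = psi and a_u*dfu + a_v*dfv = dpsi at one point x.

    All arguments are (possibly complex) numbers: the values of psi, f_u, f_v
    and of their first derivatives at the chosen point.
    """
    wronskian = fu * dfv - fv * dfu
    if wronskian == 0:
        raise ZeroDivisionError("f_u and f_v are linearly dependent at this point")
    a_u = (psi * dfv - dpsi * fv) / wronskian
    a_v = (dpsi * fu - psi * dfu) / wronskian
    return a_u, a_v


# Hypothetical check with constant p: psi = 2*f_u + 3*f_v must give back (2, 3).
k, x = 1.5, 0.7
fu, fv = cmath.exp(1j * k * x), cmath.exp(-1j * k * x)
psi = 2 * fu + 3 * fv
dpsi = 1j * k * (2 * fu - 3 * fv)
a_u, a_v = fit_coefficients(psi, dpsi, fu, 1j * k * fu, fv, -1j * k * fv)
print(a_u, a_v)  # close to 2 and 3
```

Because the Wronskian is divided out explicitly, the same routine works at any point where the two approximation functions are independent, which is the content of the remark about the constant Wronskian.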

Differential equations for $a_u$ and $a_v$ are easily set up. Thus

$$\frac{da_u}{dx} = \frac{ih}{4\pi}\left(\psi f_v'' - \psi'' f_v\right).$$


1 Cf. Appendix D, Part I. 





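On reduction with the aid of (21-11) and (21-13),

$$\frac{da_u}{dx} = \frac{ih}{4\pi}Q\psi f_v = \frac{ihQ}{4\pi p}\left(a_u + a_v e^{-2iw}\right). \tag{21-18}$$

Similarly,

$$\frac{da_v}{dx} = -\frac{ih}{4\pi}Q\psi f_u = -\frac{ihQ}{4\pi p}\left(a_u e^{2iw} + a_v\right). \tag{21-19}$$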

Introducing the fixed point $\xi$, let us make the transformation
suggested to the author by Dr. Eugene Feenberg. Here $F(x,\xi)$ is defined


by the equation F(a:,f) 



We obtain the simplified equations 


dbu 

dx 


^ ^ 2i( f4-M-) 

Air p 


K 


dx Aw p 


It is convenient at this point to introduce a quantity $\mu_\Gamma$ which we shall
call the index of quality of a path $\Gamma$, and which we define as a line integral
along $\Gamma$. Thus, let


Mr 



( 21 * 20 ) 


where $ds = |dx|$ denotes an element of arc. We further define a good
path $\Gamma$ as one which satisfies the condition $\mu_\Gamma \ll 1$. For points which
can be connected by such a path $|F(x,\xi)| \ll 1$. Under these circumstances
$a_u$, $a_v$ are sensibly equal to $b_u$, $b_v$, respectively. Hence we can
write

$$\frac{da_u}{dx} = \frac{ihQ}{4\pi p}\,a_v e^{-2iw}, \qquad \frac{da_v}{dx} = -\frac{ihQ}{4\pi p}\,a_u e^{2iw} \tag{21-21}$$

in discussing the behavior of $a_u$ and $a_v$ along a path which is good and
can be connected with $\xi$ by an extension which is also good.

Let us apply the term Stokes region to a portion of the complex x plane 

are very small and throughout which $|e^{iw}|$ is
either very large or very small. If $|e^{iw}|$ is very large, $|f_u| \gg |f_v|$ and $f_u$ is
dominant. If $|e^{iw}|$ is very small, $f_v$ is dominant. We call these portions
of the complex plane Stokes regions because, as we shall see, they are the
domains in which the Stokes phenomenon of the shifting of the coefficients
takes place. (Changes in these coefficients also occur near the


A 

9 

and — 


2ir 

p\ 

^ 47r* 





Fig. 6. Level lines of the function $|e^{iw}|$ and Stokes regions for the harmonic oscillator problem.

Our problem is to determine the coefficients $\beta_u$, $\beta_v$ in terms of $\alpha_u$, $\alpha_v$
by means of an approximate integration of Eqs. (21-21) along a suitably
chosen good path $\Gamma$. To this end it is convenient to consider, separately,
several different types of good path.

First of all, let us deal with a path which lies wholly in one of the
"neutral zones" between the different Stokes regions. On such a path,
$|e^{iw}|$ is of the order of unity and it follows from Eqs. (21-21) that $a_u$ and
$a_v$ are sensibly constant.

Consider next a Stokes region $M$ in which $f_u$ is dominant. According
to the above equations $da_u/dx$ is very small in $M$, and we infer that $a_u$
can be treated as sensibly constant along any good path $\Gamma$ confined wholly
to this region. On the other hand, $a_v$ is evidently far from constant in
$M$ except in the special case where $a_u(x)$ is zero or nearly zero at the
initial point of the path. In that case $a_u(x)$ must be nearly equal to
zero along the entire length of the path and the second of Eqs. (21-21)
indicates that $a_v$ may be sensibly constant.

In the case of a path located in a Stokes region in which $f_v$ is dominant
the behavior of the coefficients $a_u$ and $a_v$ is reversed. Thus our conclusion
is that each coefficient changes only in Stokes regions in which the corresponding
function $f_u$ or $f_v$, as the case may be, is subdominant. Even the coefficient
of the subdominant term will be sensibly constant along a good path, however,
if the coefficient of the dominant term is zero, or sufficiently small.

In order to put these statements on a rigorous basis and determine
their limits of error, the writer has made a more exact analysis of the
situation.¹ Consider first the variation in $a_u$ along a path $\Gamma$ leading
from the initial point $x_0$ to the final point $x_1$ and so drawn that $|e^{iw}|$
increases steadily as we move from the initial point toward the final point.
In this case the inequality

|a.(xi) - a.(xo)| < MnvlKl + (21-22) 

can be established. If the path starts in a neutral zone where $|e^{iw}|$ is
nearly equal to unity, we have

|a«(xi) - a,,(xo)| < 52 MrlKI + (21-23) 

The variation in $a_v$ along a similar good path is limited by the inequality
|a„(a;i) - a.,(xo)| < (21-24) 

where, however, $x_0$ is now taken to be at the uphill end of the path where $f_u$ is
dominant. The inequalities (21-22) and (21-24) readily confirm the
italicized statements of the preceding paragraph, provided that we understand
that a coefficient is to be regarded as sensibly constant along
a path when its change is small compared with the largest of the four
quantities $|a_u(x_0)|$, $|a_v(x_0)|$, $|a_u(x_1)|$, $|a_v(x_1)|$.

*21e. Derivation of the Connection Formulas. Having examined the
general problem of the variation in the $a$'s, we return to the problem of
the connection formulas. We desire to determine the coefficients
$\beta_u$, $\beta_v$ for the good portion of the axis of reals in the region $G$ in terms of
the coefficients $\alpha_u$, $\alpha_v$ in $F$. To make the problem quite definite we choose
a pair of definite points, say $P_1$ in $F$ and $P_2$ in $G$, at which the two pairs
of coefficients are to be evaluated. In order to set up the desired formulas
we must assume the existence of a path $\Gamma$ joining $P_1$ with $P_2$
through the upper half plane and having the property that $\mu_\Gamma \ll 1$.
This path must cross none of the cuts on which $f_u$ and $f_v$ are discontinuous
and hence must enclose no zeros of $E - V(x)$ between itself and the
portion of the axis of reals between $P_1$ and $P_2$. It will necessarily pass
through the Stokes region $C$ in which $f_v$ is dominant, as well as through
part of the Stokes region $A$ in which it originates.

Consider first the case of a $\psi$ function which vanishes at $x = -\infty$.
Then $\alpha_u = a_u(P_1) = 0$ and both coefficients are constant in $A$. $a_v$
remains constant along that portion of $\Gamma$ which traverses $C$ and we
conclude that $\beta_v = \alpha_v$. On the other hand, the coefficient $a_u$ is not
constant in $C$ and we should have to perform a difficult task in integration
in order to determine $\beta_u$ from the differential equation for this coefficient.
Fortunately the integration is unnecessary as we can fall back on the

¹ Cf. footnote 2, p. 96.





reality condition used on p, 96. Giving the value in order to 
make ^ real in F, it follows that 

Pu = Pv* = = e 

Thus the connection formula (21-8) is established.

As stated on p. 94, we draw the arrow in (21-8) from left to right to
indicate that the approximate validity of the left-hand member implies
that of the right, whereas the converse statement is not true. In order to
make clear the reason for this "one-way street sign" and prepare the
way for the derivation of the second connection formula, we observe
that in view of the homogeneous linear character of Eqs. (21-18) and
(21-19), the coefficients $\beta_u$ and $\beta_v$ must be homogeneous linear functions
of $\alpha_u$ and $\alpha_v$. Thus a complete exact solution of the whole connection
problem would involve a complete exact determination of the matrix
$\|g\|$ of the equations

$$\beta_u = g_{uu}\alpha_u + g_{uv}\alpha_v, \qquad \beta_v = g_{vu}\alpha_u + g_{vv}\alpha_v. \tag{21-25}$$

The approximate solution of Eqs. (21-18) and (21-19) for the special case
that $\alpha_u = 0$ shows that

$$g_{vv} = 1 + \delta, \qquad g_{uv} = i(1 + \delta^*),$$

where $\delta$ is a small quantity of the order of magnitude of $\mu_\Gamma$ which we have
neglected in Eq. (21-8).

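Additional information regarding the matrix $\|g\|$ is obtainable from the
fact that the Wronskian of any two exact solutions of the Schrödinger
equation, say $\psi$ and $\bar\psi$, is constant along the axis of reals. Thus, if

$$\psi = \alpha_u f_u + \alpha_v f_v, \qquad \bar\psi = \bar\alpha_u f_u + \bar\alpha_v f_v$$

in $F$, it follows that, in $F$,

$$\psi\bar\psi' - \bar\psi\psi' = \frac{4\pi i}{h}\left(\alpha_v\bar\alpha_u - \alpha_u\bar\alpha_v\right) = \text{constant}.$$

The Wronskian of the same pair of solutions in $G$ takes the form

$$\psi\bar\psi' - \bar\psi\psi' = \frac{4\pi i}{h}\left(\beta_v\bar\beta_u - \beta_u\bar\beta_v\right) = \frac{4\pi i}{h}\left(\alpha_v\bar\alpha_u - \alpha_u\bar\alpha_v\right)\left(g_{vv}g_{uu} - g_{uv}g_{vu}\right).$$

Equating these two expressions for the Wronskian, we see that the determinant
of $\|g\|$ is unity. Hence the inverse equations to (21-25) are

$$\alpha_u = g_{vv}\beta_u - g_{uv}\beta_v, \qquad \alpha_v = -g_{vu}\beta_u + g_{uu}\beta_v. \tag{21-26}$$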







It follows from the reality condition that if fiu = the quantities 
are real. Hen<*e 

Qvu ~ (Juv ~ • (21 27) 

Combining the last relation with the requirement that the determinant of
$\|g\|$ be equal to unity, we obtain the following relation between $g_{uu}$ and $\delta$:

$$g_{uu}(1 + \delta) + g_{uu}^*(1 + \delta^*) = 1. \tag{21-28}$$

If small quantities of the order of $\mu_\Gamma$ be neglected in comparison with
unity, one can conclude that the real part of $g_{uu}$ is $\tfrac{1}{2}$. The imaginary
part cannot be determined even approximately without an explicit
evaluation of a difficult integral, but it is possible with the aid of (21-24) to
set up the useful upper bound

|!y«„l = \gvu\ < (21-29) 

Let us now consider the validity of the inverse of the relation (21-8).
We know from the derivation that if the left-hand member fits the function
$\psi(x)$ exactly at $P_1$ the right-hand member fits the same function
approximately in the neighborhood of $P_2$. The inverse relation would
be the statement that if the right-hand member fits $\psi(x)$ exactly at $P_2$,
the left-hand member fits $\psi(x)$ approximately in the neighborhood of
$P_1$. With the aid of our partial determination of the matrix $\|g\|$ we can
test the validity of this statement. We accordingly assume, as in the
right-hand member of (21-8), that ft* = ifiv = It follows that 

the corresponding values of a« and are not and 0, but 

^i./4[l - 5)] 

— 

and (5 — d*)e ^ . Although the correction to «« is small, its product by 
the dominant approximation function $f_u$ is not necessarily small. Hence
the left-hand member of (21-8) may be entirely incorrect when the right-hand
member fits $\psi(x)$ exactly at $P_2$.

A second connection formula is a direct consequence of (21-26) and our
information regarding $\|g\|$. Consider an exact solution of (18-2) which
has the form

$$\psi(x) = 2p^{-1/2}\cos(w + \gamma) = e^{i\gamma}f_u + e^{-i\gamma}f_v$$

in the neighborhood of $P_2$. By (21-26) the corresponding values of
$\alpha_u$ and $\alpha_v$ are, if we neglect small quantities of the order of $\mu_\Gamma$,

au ®= 2e^ cos ^7 — = c^j^2X cos ^7 — sin ^7 — j- 

Here the symbol $X$ stands for the unknown, but bounded, imaginary
part of $g_{uu}$. It is not difficult to show from these formulas and the


inequality (21*29) that if Mr 1 and 


tan 


- 1)1 


<$C the 


product a4v i« small compared with auU in the neighborhood of Pj. 
Thus we obtain the connection formula 

cos ^7 — ^'i*-** ^ cos (21*30) 


Region F Region G 

This is usually specialized by setting $\gamma$ equal to $\pi/4$, in which case the
condition laid upon $\tan(\gamma - \pi/4)$ is automatically fulfilled for all positions
of $P_1$. The possibility of the above generalization to values of $\gamma$ different
from $\pi/4$ has been indicated by Langer.¹

In addition to the restriction on $\gamma$, the sole condition for the validity
of the connection formulas (21-8) and (21-30) is that there shall exist a path $\Gamma$
in the upper half plane connecting $P_1$ with $P_2$, enclosing no complex zeros
of $E - V(x)$, and having the property that $\mu_\Gamma \ll 1$.

The relations (21-8) and (21-30) apply to the left-hand boundary $x'$ of
the classical region $G$, where $V(x)$ has a negative slope. The corresponding
relations for the right-hand boundary $x''$ are (21-9) and


cos 




pdi + 7 


IpI He 


;o,s ^7 - (21-31) 


R(^gion G 


Region H 


The last formula applies to a pair of points $P_3$ in $G$ and $P_4$ in $H$, provided
that they can be connected by the usual good path $\Gamma$ and that


tan 






*21f. Derivation of the Sommerfeld Phase-integral Quantum Condition.
The Sommerfeld phase-integral quantum condition can be
derived with the aid of the connection formulas (21-8) and (21-9). We
assume that Eq. (21-13) is to be solved in the infinite interval

$$-\infty < x < +\infty$$

with the boundary conditions $\lim_{x\to\pm\infty}\psi(x) = 0$, the potential function being
everywhere differentiable and greater than the energy $E$ except in the
single region of classical motion 0. It follows that a„ must vanish at 

ih 

X = + 00. By (21-18) ~ 15^ I Qfvpdx with the integration carried 

out along the axis of reals. 

¹ Loc. cit., footnote 1, p. 96.





Obviously, then, $a_u(P_1)$ will not be exactly zero and we must make sure that it is small
enough to make (21-8) applicable. Replacing $x$ by $w$ as the variable of integration
we obtain the upper bound


1 /•«»(/*.) I Qh^ 


If. 


2jv>{- oo)|47rV)^i| 

Qh^ 


^'^\yp\\dw\. 


Let N denote the maximum value of range of integration. Since |^(a;)| 

decreases moiiotonically as we pass from P\ out to — oo along the axis of reals (c/. 
Appendix C), we may replace it by |^(Pi)|. Thus 


Kl < I c-''‘ d\w\ 

^ JwU*x) 

< (21.32) 


If is a good approximate solution (o (21 13) throughout the range of integration, 
N /\p{Pi)yi\ will be much less than unity and 


W« Wc-2-(/V. 

Let 6/3u and 5/3„ denote th(i (*rrors in jfiu and du(^ to sea ting «« — 0, as in th(^ h^ft- 
hand member of (21-8). Then 


\^^u\ ■■= |.^um||«u|; ^ |f/r«||Q!«|. 

The inequality (21*29) gives an upper bound on |(7aM| and while the ‘^anper- 
turbed” values of |/^u| and \(iv\ are both equal to lai| if we neglect small quantities of 
the order of Hence 


W |j/}J iV^r 


These inequalities show that the error in using (21-8) for the determination of the
form of an eigenfunction in the classical domain $G$ is negligible provided that $\mu_\Gamma$ and
$N/2|p(P_1)^{1/2}|$ are small compared with unity. Of course (21-9) stands on the same
footing in this respect as (21-8).


Let US now assume that the energy level under consideration is high 




« 1 . 


enough so that there is a portion of the region G in which 

If there are no complex roots of $E - V(x)$ near the interval $G$ of the axis
of reals, we can assume the existence of good paths $\Gamma$ and $\Gamma'$, connecting
$G$ with a pair of points $P_1$ in $F$, and $P_4$ in $H$, respectively. Equations
(21-8) and (21-9) give two different approximate expressions for $\psi(x)$ in
the good portion of $G$. Equating these expressions, one obtains


2p-w cos - i] 2Cp-« cos + j] 

given in Sec. 21b. This equation leads at once to the Sommerfeld
formula (21-10).
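The step from the matching condition to the quantum condition can be spelled out as follows. The notation $W_1$, $W_2$, $J$ is introduced here only for illustration, and the half-integer form of the result is assumed to be the content of (21-10), since the display itself is not legible at this point. Write the left-hand and right-hand approximations in the good portion of $G$ as cosines whose phases are measured from $x'$ and from $x''$ respectively.

```latex
\begin{aligned}
W_1(x) &= \frac{2\pi}{h}\int_{x'}^{x} p\,dx, \qquad
W_2(x) = \frac{2\pi}{h}\int_{x}^{x''} p\,dx, \qquad
W_1 + W_2 = J \equiv \frac{2\pi}{h}\int_{x'}^{x''} p\,dx, \\[4pt]
2p^{-1/2}\cos\!\left(W_1 - \tfrac{\pi}{4}\right)
 &= 2C\,p^{-1/2}\cos\!\left(W_2 - \tfrac{\pi}{4}\right)
 = 2C\,p^{-1/2}\cos\!\left(W_1 - J + \tfrac{\pi}{4}\right), \\[4pt]
\text{valid for all } x \text{ in } G
 &\iff J - \tfrac{\pi}{2} = n\pi, \quad C = (-1)^n, \quad n = 0, 1, 2, \dots \\[4pt]
\Longrightarrow\quad \oint p\,dx &= 2\int_{x'}^{x''} p\,dx = \left(n + \tfrac{1}{2}\right)h .
\end{aligned}
```

The two cosines can agree over a whole interval only if their arguments differ by a multiple of $\pi$, which fixes both the constant $C$ and the phase integral.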





The energy levels given by (21-10) for the ideal linear oscillator are
identical with those derived by rigorous methods in Sec. 20. This is
somewhat surprising inasmuch as the conditions for the validity of the
connection formulas are not fulfilled at the lower energy levels. In the

case of the lowest level, for example, the value of 


at the mid-point 


of G is 3^ and 


']ds has a minimum value of about for any path 



leading from the point in question to the Stokes regions $B$, $A$, and $C$
(Fig. 6). Consequently we are led to inquire whether there is not some
general principle which will permit us to establish Eq. (21-10) independently
of the connection formulas (21-8) and (21-9). The answer is
that such a principle does exist. We state only the general result in this
section, referring the reader to Part II, Appendix D, for the proof.¹
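The agreement with the rigorous oscillator levels can be checked numerically. The sketch below assumes that (21-10) is the half-integer condition $\oint p\,dx = (n + \tfrac{1}{2})h$ and works in units where the mass, the force constant, and $h/2\pi$ are all unity, so that the exact levels are $n + \tfrac{1}{2}$; the function names are illustrative.

```python
import math


def phase_integral(energy, mass=1.0, k=1.0, steps=400):
    """Integral of p dx over one full classical cycle for V(x) = k*x**2/2."""
    amplitude = math.sqrt(2.0 * energy / k)  # turning points at +/- amplitude
    d_theta = math.pi / steps
    half_cycle = 0.0
    for i in range(steps):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta
        # x = amplitude*sin(theta) keeps the integrand smooth at the turning points
        x = amplitude * math.sin(theta)
        p = math.sqrt(max(0.0, 2.0 * mass * (energy - 0.5 * k * x * x)))
        half_cycle += p * amplitude * math.cos(theta) * d_theta
    return 2.0 * half_cycle


def sommerfeld_level(n, h=2.0 * math.pi, e_lo=1e-9, e_hi=200.0):
    """Energy at which the phase integral equals (n + 1/2)*h, found by bisection."""
    target = (n + 0.5) * h
    for _ in range(80):
        e_mid = 0.5 * (e_lo + e_hi)
        if phase_integral(e_mid) < target:
            e_lo = e_mid
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)


levels = [sommerfeld_level(n) for n in range(4)]
print(levels)  # close to 0.5, 1.5, 2.5, 3.5
```

The lowest level comes out as accurately as the higher ones, which is the surprising feature the text goes on to explain.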

Let the basic problem under consideration be the same as the one
discussed above, viz., that of finding the discrete energy levels for a one-dimensional
anharmonic oscillator with the fundamental interval
$-\infty < x < +\infty$, an analytic potential function, and a single potential
valley. We assume that for the energy or energies under discussion
h - 10 

r- is small throughout the intervals F and // except in the immediate 

neighborhood of the end points $x'$ and $x''$. Instead of postulating good
paths $\Gamma$ and $\Gamma'$ connecting $G$ with $F$ and $H$, however, we postulate a single
path $\Gamma$ connecting a pair of points $P_1$ in $F$ and $P_4$ in $H$ without enclosing
any complex zeros of $E - V(x)$, and having the property that $\mu_\Gamma \ll 1$.
We further assume that $N/|p(P_1)^{1/2}|$ is small compared with unity; a
parallel condition for the point $P_4$ would be equally good. Under
these conditions Eq. (21-10) can be shown to hold even for the low eigenvalues
for which no good paths connecting $G$ with $F$ and $H$ can be set up.

The only doubtful point likely to arise in the application of this
result to specific problems is the question whether the good path $\Gamma$ can
actually be set up. In order to test this point qualitatively one can
replace $E - V(x)$ by an approximating polynomial of degree $k$,

$$T(x) = \text{const.} \times \prod_{j=1}^{k}(x - x_j).$$


If this approximation is good throughout the interval between suitably 
chosen points Pi and P4 we can compute ® and ~ along a path con- 
necting Pi with Pa from T{x). The condition that a path T shall be 

¹ The work of Part II, Appendix D, leads to essentially the same result as that
obtained by Birkhoff (loc. cit., footnote 4, p. 91) by entirely different methods.





"good" is then readily seen to be that it shall keep away from all the
complex roots of $T(x)$.

The writer has carried through a test computation of the accuracy of
(21-10) using the Morse potential function for the normal state of the H₂
molecule. This function has the form

$$V = D\left(e^{-2ax} - 2e^{-ax}\right),$$

where $x$ is the displacement from the equilibrium position. Let $\zeta = e^{-ax}$.
Then

$$E - V = E + 2D\zeta - D\zeta^2.$$

The zeros of this function are

$$x = -\frac{1}{a}\left[\log_e\left(1 \pm \sqrt{1 + \frac{E}{D}}\right) + 2\pi i m\right], \qquad m = 0, \pm 1, \pm 2, \cdots$$

Thus we obtain two sequences of zeros in the complex $x$ plane equally
spaced along lines parallel to the imaginary axis and passing through
$x'$ and $x''$. The computation is simplified by choosing $E$ equal to $-D$ so
as to bring the points $x'$ and $x''$ together. This should not materially
affect the value of $\mu_\Gamma$ for a path that cuts midway between the nodes $x'$,
$x''$ and the next pair above them. Taking $\Gamma$ to be a circle of radius $\pi/a$
with its center at the origin, and inserting the values of $a$, $D$, and the mass
coefficient appropriate to the normal state of H₂, we obtain $\mu_\Gamma = 0.015$.
The quantity $N/|p(P_1)^{1/2}|$ of (21-32) is so small that errors due to the
finite values of $a_u$ at $P_1$ and $P_4$ are negligible in comparison with those
due to $\mu_\Gamma$. Let $\delta E$ denote the maximum error involved in using (21-10)
and let $\Delta E$ denote the energy-level spacing for adjacent values of the
quantum number $n$. Then $\delta E/\Delta E$ can be worked out and proves to be
sensibly equal to 0.003. Thus we can trust the Sommerfeld formula in
this case to about three-tenths of a per cent of the spacing of adjacent
energy levels.
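The location of the complex zeros is easy to verify numerically. The sketch below assumes the standard Morse form $V = D(e^{-2ax} - 2e^{-ax})$, which is consistent with the expression for $E - V$ in terms of $\zeta$; the numerical values of `energy`, `depth`, and `a` are arbitrary illustrative choices, not the H₂ constants used in the test computation.

```python
import cmath
import math


def morse_zeros(energy, depth, a, m=0):
    """Zeros of E - V(x) for V = D*(exp(-2ax) - 2*exp(-ax)), with -D <= E < 0."""
    root = math.sqrt(1.0 + energy / depth)
    zeros = []
    for sign in (+1.0, -1.0):
        zeta = 1.0 + sign * root  # zeta = exp(-a*x) at a zero
        zeros.append(-(cmath.log(zeta) + 2j * math.pi * m) / a)
    return zeros


def e_minus_v(x, energy, depth, a):
    """E - V(x) evaluated for complex x."""
    zeta = cmath.exp(-a * x)
    return energy + 2.0 * depth * zeta - depth * zeta * zeta


# Arbitrary illustrative parameters (not molecular constants).
energy, depth, a = -0.5, 1.0, 1.3
for m in (-1, 0, 2):
    for x0 in morse_zeros(energy, depth, a, m):
        print(m, x0, abs(e_minus_v(x0, energy, depth, a)))
```

For `m = 0` the two zeros are the real turning points; each further value of `m` shifts both by the same imaginary amount, giving the two equally spaced columns of zeros described above.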

Of course the actual error involved in (21-10) should be appreciably
less than the computed upper bound. Moreover, since $\mu_\Gamma$ varies
inversely as the square root of the effective mass, the above figure would
be materially improved in the case of a heavier molecule. In the case
of a harmonic oscillator (quadratic potential function) there are no roots
of $E - V(x)$ except those at $x'$ and $x''$. Hence we can choose $\Gamma$ as a
semicircle of indefinitely large radius and thus reduce the maximum
error of the Sommerfeld formula below any assignable limit.





*21g. Higher Approximations. A better approximation to the energy
eigenvalue can be obtained in most cases by using higher order terms in
the series expansion of the function $y$ of Eq. (21-2).¹ If we take an $n$-term
approximation

n 

k^O ^ 


where n is not too large, the differential equation for the function c'^J 
will usually be a better approximation to (18-2) along $\Gamma$ than the approximations
we have been using. It follows from the argument of Part II,
Appendix D, that the above function must have the same values on the
upper and lower lips of the cut in $H$. Thus we can replace Eq. (21-10)
by the more accurate energy level condition

= mh. m == 0 , 1, 2, * • • (21*34) 

As previously stated, however, the series obtained by letting $n$ become
infinite is only semiconvergent. Hence we cannot improve our result
indefinitely by making $n$ very large.

*21h. Modification of Method for Radial Motion in Two-particle
Problem.² In applying the B. W. K. method to the radial equation of the
two-particle problem [Eq. (28-19), p. 150], one meets with an apparent
difficulty in that the fundamental region in which the equation is to be
solved ranges from 0 to $\infty$ instead of from $-\infty$ to $+\infty$. Moreover, the
B. W. K. approximations do not have the right character to fit the exact
solution of the differential equation at the left-hand boundary point
$r = 0$.³ Thus there is no good path $\Gamma$ leading from one boundary to the
other because the left boundary is itself "bad." As pointed out by
Kramers,⁴ however, it is possible to fit the boundary conditions at both
ends of the fundamental region if we modify the approximation formulas
by the addition of the term $h^2/32\pi^2\mu r^2$ to the potential energy. The
added term is negligible except in the immediate neighborhood of the
origin, and it is ordinarily possible to find a path $\Gamma$ leading from $r = 0$
to $r = \infty$ along which the modified B. W. K. functions are good approximate
solutions of (18-2). Under these circumstances Eq. (21-10) gives
the energy levels if the momentum $p(r,E)$ is computed from the modified
potential function [cf. Sec. 28g, Eq. (28-23)].

¹ Cf. J. L. Dunham, Phys. Rev. 41, 713-720, 721-731 (1932).

² This section amplifies the brief treatment of the same subject previously given
by the author in the reference of footnote 2, p. 95. An article by R. E. Langer
in the current number of the Physical Review (Apr. 15, 1937) contains an independent
discussion of the same problem which overlooks the earlier work of the present writer.

³ Consider, for example, Eq. (28-19) when $l = 0$ and $V(r)$ has a pole of the first
order at the origin. In this case both of the B. W. K. approximations $f_u$, $f_v$ vanish
at the origin, whereas no exact solution of the differential equation can have a zero
at that point.

⁴ Loc. cit., footnote 2, p. 91.
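The effect of the added term can be illustrated on a case whose exact levels are known. The sketch below is an assumption-laden example, not the author's computation: it takes a hydrogen-like Coulomb potential $-1/r$ in atomic units ($h/2\pi = \mu = 1$), assumes the usual centrifugal term $l(l+1)/2r^2$ in the radial equation, and assumes that (21-10) reads $\oint p\,dr = (n_r + \tfrac{1}{2})h$. In these units the added term $h^2/32\pi^2\mu r^2$ is $1/8r^2$, so that $l(l+1)$ is replaced by $(l + \tfrac{1}{2})^2$.

```python
import math


def radial_phase_integral(energy, l, steps=1000):
    """Closed-cycle integral of p dr for V = -1/r + (l + 1/2)**2 / (2 r**2), E < 0."""
    big_l_sq = (l + 0.5) ** 2  # l*(l+1) plus the added 1/4
    s = math.sqrt(max(0.0, 1.0 + 2.0 * energy * big_l_sq))
    centre = 1.0 / (-2.0 * energy)  # mid-point of the two turning points
    half_width = s * centre         # turning points at centre -/+ half_width
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta
        r = centre + half_width * math.sin(theta)
        # p = sqrt(-2E) * sqrt((r - r1)*(r2 - r)) / r, with dr = half_width*cos(theta) d(theta)
        total += (half_width * math.cos(theta)) ** 2 / r * d_theta
    return 2.0 * math.sqrt(-2.0 * energy) * total


def modified_bwk_level(n_r, l):
    """Energy at which the modified phase integral equals (n_r + 1/2)*h, h = 2*pi."""
    target = (n_r + 0.5) * 2.0 * math.pi
    e_lo = -1.0 / (2.0 * (l + 0.5) ** 2)  # circular-orbit limit, integral vanishes
    e_hi = -1e-4
    for _ in range(80):
        e_mid = 0.5 * (e_lo + e_hi)
        if radial_phase_integral(e_mid, l) < target:
            e_lo = e_mid
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)


for n_r, l in [(0, 0), (1, 0), (0, 1), (2, 1)]:
    n = n_r + l + 1
    print(n_r, l, modified_bwk_level(n_r, l), -0.5 / n ** 2)
```

Under these assumptions the phase-integral levels coincide with the exact values $-1/2n^2$, $n = n_r + l + 1$, including the states with $l = 0$ for which the unmodified approximation fails at the origin.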

Let us assume that V (r) has the form 

V(r) = -j + ^ -f <p{r), 0 < r < 00 

where <p(r) is analytic along the entire positive real axis and at the origin. Let the 
modified potential function be 

The corresponding modified momentum is 

p<”‘) = [2 m(B - 4>)]^. 

Let the modified B. W. K. approximations and be computed from 
just as/u and/v are computed from p. Then the modified approximations are readily 
proved to be solutions of tlui equation 



in which is defined by 



Comparing the above differential equation with (21*11), we see that plays the part 
formerly played by Q in determining the quality of the modified approximations and 
in determining the variation in the coefficients when a linear combination of 
and is fitted to an exact wave function ^(r). Whereas Q becomes infinite as 
l/r^ at the origin, becomes infinite there only as 1/r. remains finite at 

the origin and vanishes there. 

Thus one is led to hope for a good path, leading from the origin to a point on the
axis of reals beyond the region $G$, along which the coefficient $a_u$ is sensibly constant.
Investigation shows that, if $\varphi(r)$ is well behaved, there is no difficulty in finding a path
of this kind. The validity of Eq. (21-10) when used with the modified momentum
formula is a direct consequence of the existence of the good path.

21i. Asymptotic Agreement of Wave Theory and Classical Theory
Regarding Position of Particle. It is interesting to note that in the case
of a state of high quantum number the Bohr theory gives us not only the
correct energy levels but also the correct value of the probability of
finding the particle on any segment $\Delta x$ of the $x$ axis, provided that $\Delta x$
is large compared with the de Broglie wave length $\lambda = h/p(x,E)$ or is
made equal to an integral number of half wave lengths.¹ In computing
this probability on the basis of the semiclassical Bohr theory, one must
assume that the phase of the motion is unknown. The probability in

¹ Cf. J. H. Van Vleck, J. Franklin Inst. 207, 475 (1929); Proc. Nat. Acad. Sci. 14,
178 (1928).





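question $\Delta P$ is then equal to the ratio of the time occupied in crossing
the segment $\Delta x$ to the half period of vibration. Denoting the period
by $T$ and the classical velocity by $v$, we have

$$\Delta P = \frac{2}{T}\int_x^{x+\Delta x}\frac{d\xi}{v(\xi,E)} = \frac{2\mu}{T}\int_x^{x+\Delta x}\frac{d\xi}{p(\xi,E)}.$$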


Computing from the B. W. K. approximation 
\l^(XyE) — cos [w{XyE) + 7 ], 

we obtain 

dF = ^ cos^ (w + 7 ) 


(21-35) 


for the probability of an infinitesimal interval $dx$ in the region of real
classical momentum. To eliminate the fluctuations due to the nodes of
$\psi$ we replace the phase factor $\cos^2(w + \gamma)$ by its average value in computing
the probability of a macroscopic segment $\Delta x$. Thus



z+Ax 


pHTeY 


(21-36) 


As the probability of finding a particle in the region of imaginary classical
momentum becomes negligible for large energies, $\psi$ can be normalized by
integrating over the region $G$ alone. In this way one obtains


A 27T 

1=1 xl4*dx = (21-37) 

Combining (21-37) with (21-36), one obtains the Bohr equation (21-35). 
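The asymptotic agreement can be seen in a small numerical experiment. The sketch below is illustrative only: it uses the harmonic oscillator in units where the mass, the angular frequency, and $h/2\pi$ are unity, takes the level $n = 30$, and compares the classical time-fraction probability for the middle half of the classical range with the integral of $|\psi_n|^2$ over the same segment. The eigenfunctions are generated by the standard three-term recurrence for normalized Hermite functions; all function names are assumptions of this example.

```python
import math


def classical_probability(a, b, amplitude, steps=4000):
    """(2/T) * integral of dx / v over [a, b] for x(t) = amplitude * sin(omega * t)."""
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * width
        # 2 / (T * v) = 1 / (pi * sqrt(amplitude**2 - x**2)); omega cancels
        total += width / (math.pi * math.sqrt(amplitude ** 2 - x ** 2))
    return total


def oscillator_density(n, x):
    """|psi_n(x)|**2 for the oscillator, built up by a stable recurrence."""
    prev = 0.0
    cur = math.pi ** -0.25 * math.exp(-0.5 * x * x)
    for k in range(n):
        prev, cur = cur, (math.sqrt(2.0 / (k + 1)) * x * cur
                          - math.sqrt(k / (k + 1)) * prev)
    return cur * cur


def quantum_probability(n, a, b, steps=4000):
    """Midpoint-rule integral of |psi_n|**2 over [a, b]."""
    width = (b - a) / steps
    return width * sum(oscillator_density(n, a + (i + 0.5) * width)
                       for i in range(steps))


n = 30
amplitude = math.sqrt(2 * n + 1)  # classical amplitude for E = n + 1/2
p_classical = classical_probability(-0.5 * amplitude, 0.5 * amplitude, amplitude)
p_quantum = quantum_probability(n, -0.5 * amplitude, 0.5 * amplitude)
print(p_classical, p_quantum)  # the classical value is 1/3
```

The segment spans many de Broglie wave lengths, so the residual difference comes only from the unaveraged half-waves at its two ends.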

21j. The Transmission of Progressive Matter Waves through a
Potential Hill. A very important aspect of quantum mechanics is that
it permits particles of energy $E$ to penetrate a potential hill where $E - V$
is locally negative. We are already familiar with this penetration as
exhibited by the fact that in the one-dimensional oscillator problem of
Secs. 18 and 19 the wave function $\psi$ does not reduce to zero at the classical
turning points $x'$ and $x''$. Another consequence of the continuity of the
Schrödinger equation through regions of negative kinetic energy is that
particles can actually pass through barriers that would be classically
impenetrable and come out on the other side. This phenomenon is
sometimes called the "tunnel effect" and forms the basis of recent
theories of radioactive decay (cf. Sec. 31), of the extraction of electrons
from cold metals by intense electric fields,¹ of the predissociation of
molecules,² etc. It could be treated by means of wave packets which on

¹ Cf. R. H. Fowler and L. Nordheim, Proc. Roy. Soc. 119, 173 (1928).

² Cf. O. K. Rice, Phys. Rev. 34, 1451 (1929); 35, 1538 (1930).





striking the potential barrier are split into two parts, one reflected and
the other transmitted through the barrier. A simpler procedure, however,
is to study the reflection of a steady stream of particles incident on a
potential barrier from one side, using the method of nonquadratically
integrable wave functions mentioned in Sec. 8. This is the method
almost universally employed for the study of reflection and scattering
phenomena.

Fig. 7. The transmission of matter waves through a potential hill. The curve marked
$y = E + \text{R.P.}\,\psi$ indicates the real part of the wave function for waves of energy $E$
incident on the hill from the left.

The B. W. K. approximation functions and the Kramers connection
formulas have been much used for the approximate evaluation of the
transmission and reflection coefficients for the tunnel effect. The
approximation functions are particularly adapted to the case of a parabolic
potential such as that indicated in Fig. 7. We accordingly assume the
potential function $V_0 - \tfrac{1}{2}kx^2$ and choose the zero level of energy as $V_0$.
The expression for the square of the classical local momentum becomes

$$p^2 = 2\mu(E - V) = \mu k\left(x^2 + \frac{2E}{k}\right).$$

E is assumed to be negative. It is convenient to modify the sign con- 
vention of footnote 2, p, 92, and make p negative real and ip"^^ 
positive real to the left of the classical turning point x\ of Fig. 7. Both 
p and are then positive real to the right of X 2 (c/. Appendix D, 
Part III). With these conventions fu represents an outgoing steady 
current of particles on both sides of the hill and fv a similar incoming 
current. 

To prove this last statement we introduce the mass current density
$I$ defined for particles moving in three dimensions by Eq. (8-5), Chap. I.
Setting the vector potential of Eq. (8-5) equal to zero, we obtain the
expression

for the component of $I$ in the positive direction of the $x$ axis in our one-dimensional
problem. The values of $I$ for $f_u$ and $f_v$ on the axis of reals
are


Ilfu] 


(P + 

2 \p\ 


m = - 


(£_+ P*) 

2N 


In $A$ the functions $f_u$ and $f_v$ give real currents of unit magnitude directed
along the negative and positive axes, respectively. Denoting the real
integral $\frac{2\pi}{h}\int_{x_1}^{x_2}|p|\,dx$ by $K$, we find that the current in the region $C$ to the
right of $x_2$ is $e^{-2K}$ for $f_u$ and $-e^{2K}$ for $f_v$. This verifies the statement at
the end of the preceding paragraph.

An examination of the Stokes regions given in Part III, Appendix D,
shows that if there is a good path $\Gamma$ passing around the turning points $x_1$
and $x_2$ from $A$ to $C$ without enclosing other roots of $E - V(x)$, we can
establish the existence of approximate connection formulas of the
character


fu “ 1 “ V ^ fu y 

fu ^ fu ” 1 ” ^fv' 
I I I I 

A C 


(21*38) 

(21*39) 


Here $f_u'$, $f_v'$ denote the values of $f_u$ and $f_v$ appropriate to the lower edge of
the cut along the portion $C$ of the axis of reals. The upper connection
formula is applicable to the reflection of particles incident on the hill
from the left, while the lower one applies to a stream of particles incident
on the hill from the right.

In the region A to the left of Xi we have/„ = — It follows that if 
a and jS are any constants, 

I[afu + m = /[«/«] + IWr] = 101^^ ~ Ial^. 


Thus the approximate wave function $f_u + cf_v$ represents the superposition
of an incident stream of magnitude $|c|^2$ and a reflected stream of unit
magnitude. The corresponding transmitted stream in $C$ has the magnitude
$e^{-2K}$. The net current on the two sides of the hill must be the
same, as one can prove from the constancy of the Wronskian of any
exact wave function $\psi$ and its conjugate $\psi^*$. Hence

$$|c|^2 - 1 = e^{-2K}. \tag{21-40}$$

The corresponding transmission coefficient for an energy which is less
than $V_0$ is

$$\theta = \frac{|c|^2 - 1}{|c|^2} = \frac{1}{1 + e^{2K}}. \tag{21-41}$$


A complete determination of the complex constant $c$ would determine
the phase of the incident wave with respect to the transmitted and
reflected waves. Unfortunately the methods here employed are not
powerful enough for this purpose.¹

The same transmission coefficient is derivable from (21-39) for
particles incident from the right. As pointed out to the writer by
Dr. Eugene Feenberg, the equality of the two transmission coefficients
for particles incident on either side of an asymmetric potential barrier
is a direct consequence of the constancy of the Wronskian of the corresponding
pair of wave functions.

Formula (21-41) is free from all restrictions depending on the height of
the hill provided that $E - V_0 < 0$. It holds not only for the ideal
case of a parabolic hill but for any analytic potential hill which yields
no other zeros of $E - V(x)$ near $x_1$ and $x_2$ and which therefore permits a good
path $\Gamma$ joining the regions $A$ and $C$.

If the maximum potential energy $V_0$ is less than the energy of the
incident particles, a parabolic potential function yields two imaginary
roots of $E - V(x)$, viz.,

$$x_1, x_2 = \pm i\sqrt{2E/k}.$$


Denoting the integral 


r 




by $K'$, and the transmission coefficient by $\theta'$, one readily proves that

$$\theta' = \frac{1}{1 + e^{-2K'}}, \qquad E \ge V_{\max}. \qquad (21.42)$$


The reader will observe that these two expressions for the transmission
coefficient join continuously at the intermediate case where $x_1 = x_2 = 0$,
giving the common value $\tfrac{1}{2}$ for $\theta$ when the energy level $E$ just touches
the top of the potential hill.
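
The continuity of the two expressions at the top of the hill can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the parabolic hill $V(x) = V_{\max} - \tfrac{1}{2} m\omega^2 x^2$, for which the barrier integral is assumed to reduce to $K = \pi (V_{\max} - E)/\hbar\omega$, and it treats the case $E > V_{\max}$ by letting $K$ go negative, so that $K = -K'$. The function names are hypothetical.

```python
import math

def kemble_transmission(K):
    """Transmission coefficient 1/(1 + exp(2K)).

    K > 0 corresponds to an energy below the top of the hill, K < 0 to an
    energy above it (K = -K' in the notation of the text, by assumption).
    """
    if K >= 0:
        e = math.exp(-2.0 * K)          # avoids overflow for a high hill
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(2.0 * K))

def parabolic_K(E, V_max, hbar_omega):
    """Assumed closed form of the barrier integral for a parabolic hill."""
    return math.pi * (V_max - E) / hbar_omega

if __name__ == "__main__":
    for E in (0.5, 0.9, 1.0, 1.1, 1.5):
        K = parabolic_K(E, V_max=1.0, hbar_omega=0.2)
        print(f"E = {E:4.2f}   K = {K:+8.3f}   theta = {kemble_transmission(K):.6f}")
```

At $E = V_{\max}$ the integral vanishes and the coefficient is exactly one-half; the values for $K$ and $-K$ always add up to unity.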

¹ In case the maximum value of $V(x) - E$ is very large (high hill), the Kramers
connection formulas are applicable, but their unidirectional character leads to diffi-
culty. It does not seem possible to set up a legitimate derivation of formulas for $c$
and $d$ in this way.



CHAPTER IV 


THE MATHEMATICAL THEORY OF COMPLETE SYSTEMS OF 
ORTHOGONAL FUNCTIONS 

22. SCALAR PRODUCTS AND SYSTEMS OF ORTHOGONAL FUNCTIONS 

22a. Expansion in a Series of Functions.— It is well known that an 
arbitrary function of x, subject to certain continuity restrictions, can be 
developed into a Fourier's series

$$f(x) = \sum_{n=0}^{\infty} (a_n \cos nx + b_n \sin nx), \qquad (22.1)$$

which is convergent and represents the function in the interval
$0 \le x \le 2\pi$.

A similar development is possible in terms of the Hermitian orthogonal 
functions obtained in Sec. 20 as the eigenfunctions of the problem of the
ideal linear oscillator in wave mechanics. In fact it is a property of a 
large class of similar boundary-value problems in one or more dimensions 
that they yield systems of eigenfunctions for which such a series develop- 
ment is both possible and convenient. 

The importance of such a development will be evident from the
following consideration. We assume that the expansion

$$f(x) = \sum_{n=0}^{\infty} a_n \psi_n(x) \qquad (22.2)$$

is valid for any $f(x)$ which is quadratically integrable and piece-by-piece
smooth¹ over the range $-\infty < x < +\infty$. Then with the aid of the
eigenfunctions $\psi_n(x)$ we can at once determine the complete wave function
$\Psi(x,t)$ for the harmonic-oscillator problem with arbitrary initial con-
ditions. Suppose, for example, that $\Psi(x,0)$ is to have the form $f(x)$.
The function

$$\Psi(x,t) = \sum_{n=0}^{\infty} a_n \psi_n(x)\, e^{-2\pi i E_n t/h} \qquad (22.3)$$

¹ A function is said to be piece-by-piece continuous (stückweise stetig) in an
interval $K$ when the interval can be divided up into a finite number of subintervals
such that the function is continuous in each and approaches a finite limit value as $x$
approaches any of the subinterval boundaries. It is said to be piece-by-piece smooth
(stückweise glatt) if it also has piece-by-piece continuous first derivatives in $K$.



reduces to $f(x)$ when $t = 0$, and if the series is term-by-term differentiable,
it is a solution of the second Schrödinger equation appropriate to the
problem, viz.,

^ ^ ^ q (22-4) 

dx^ ^ ^ 

Thus the series expansion enables us to pass from the special solution
$\psi_n(x)$ to a solution of the most general type for the given mechanical
system. It is also of great practical importance because it is the basis
of most perturbation-theory computations.

Postponing for the present the discussion of the legitimacy of such
an expansion as that given in Eq. (22.2), we note that the determination
of the coefficients is greatly facilitated by the use of Eq. (20.10). To
fix the value of any coefficient $a_k$ we use the same procedure as in finding
the coefficients of a Fourier's series. Multiply Eq. (22.2) by $\psi_k^*(x)$
and integrate term by term over the interval $-\infty < x < +\infty$. Equa-
tion (20.10) shows that all terms in this sum drop out except the one
for which $n = k$. Since

$$\int_{-\infty}^{+\infty} \psi_n \psi_n^*\,dx = 1,$$

we obtain

$$a_k = \int_{-\infty}^{+\infty} f(x)\,\psi_k^*(x)\,dx. \qquad (22.5)$$

The integral which forms the right-hand member of Eq. (22.5) is
called the scalar product of the functions $f$ and $\psi_k$ and is indicated by the
heavy round parenthesis symbol $(f,\psi_k)$. The idea of the scalar product of
two functions is of such fundamental importance in the mathematical
theory of quantum mechanics that we pause here to review briefly the
complex of ideas regarding the relations between vectors and sets of
functions with which it is associated.
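
The coefficient rule can be tried out numerically. The following sketch is an illustration only, not the procedure of Sec. 20: it assumes the dimensionless coordinate $\xi$ in which the normalized oscillator eigenfunctions are $\psi_n(\xi) = (2^n n!\sqrt{\pi})^{-1/2} H_n(\xi) e^{-\xi^2/2}$, builds them from the standard three-term recurrence, and evaluates the scalar products $(f,\psi_k)$ by the trapezoidal rule for a displaced Gaussian chosen as a test function. All function names are hypothetical.

```python
import math

def oscillator_functions(xi, n_max):
    """Normalized oscillator eigenfunctions psi_0 .. psi_n_max at the point xi."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * xi * xi)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * xi * psi[0])
    for n in range(1, n_max):
        # Three-term recurrence for the normalized functions.
        psi.append(math.sqrt(2.0 / (n + 1)) * xi * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi

def expansion_coefficients(f, n_max, half_width=12.0, steps=4000):
    """a_k = integral of f * psi_k (psi_k is real), trapezoidal rule."""
    h = 2.0 * half_width / steps
    a = [0.0] * (n_max + 1)
    for j in range(steps + 1):
        xi = -half_width + j * h
        weight = 0.5 * h if j in (0, steps) else h
        fx = f(xi)
        for k, p in enumerate(oscillator_functions(xi, n_max)):
            a[k] += weight * fx * p
    return a

def partial_sum(a, xi):
    """Truncated series sum_k a_k psi_k(xi)."""
    return sum(ak * p for ak, p in zip(a, oscillator_functions(xi, len(a) - 1)))

def displaced_gaussian(xi):
    return math.exp(-0.5 * (xi - 1.0) ** 2)

a = expansion_coefficients(displaced_gaussian, n_max=20)
for xi in (-1.0, 0.3, 2.0):
    print(xi, displaced_gaussian(xi), partial_sum(a, xi))
```

For this test function the series converges rapidly, so twenty-one terms already reproduce $f$ to many figures; a function with discontinuities would converge far more slowly.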

22b. Comparison of Properties of Vectors and Functions. — There is a 
very important and fundamental similarity between the properties of 
vectors in a space of n dimensions and the properties of quadratically 
integrable functions of a set of variables $x_1, \ldots, x_m$. This similarity
lies at the basis of von Neumann's formulation of the problem of quantum
mechanics in terms of transformations in Hilbert space.¹ Although
a thoroughgoing treatment of Hilbert space and its relation to quantum 
mechanics lies outside the scope of this book, a brief discussion of the 
analogy between vectors and functions will be useful. 

The basis for the similarity in question lies in the fact that the 
fundamental operations of ordinary vector algebra, with the exception 
of the formation of vector products, have parallels in important opera- 
tions which may be applied to functions of one or more independent 
variables which are defined and quadratically integrable over a common 

¹ Cf. Johann v. Neumann, Göttinger Nachrichten, 1927, p. 1; M.G.Q.; Marshall
H. Stone, Linear Transformations in Hilbert Space, New York, 1932.





domain $K$ in the space over which the independent variables range.
(This latter space we shall hereafter designate as configuration space,
whether the independent variables $x_1, \ldots, x_m$ are the Cartesian positional
coordinates of a dynamical system or not.) The first two of the operations
in question are the addition of vectors (or functions) and the multiplica-
tion of a vector (or function) by a scalar or complex number. Both
operations obey the ordinary rules of elementary algebra whether applied
to vectors or to functions spread out over configuration space.

The third basic operation is the formation of the scalar product 
through which the idea of the length of a vector finds expression. The 
introduction of this operation converts the affine geometry of vectors into 
a metric geometry.¹ It will be useful briefly to recall the properties
of the scalar product of two vectors before considering the definition 
of the scalar product of two functions. 

22c. Scalar Products of Vectors. — In the case of two real three-
dimensional vectors $A$, $B$ with components $A_1, A_2, A_3$ and $B_1, B_2, B_3$,
respectively, the scalar product $A \cdot B$ is defined analytically by the
equation²

$$A \cdot B = \sum_{k=1}^{3} A_k B_k. \qquad (22.6)$$

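This scalar product has the following important properties:

$$A \cdot B = B \cdot A, \qquad (22.7)$$
$$(aA) \cdot B = a(A \cdot B) = A \cdot (aB), \qquad (a = \text{a scalar}) \qquad (22.8)$$
$$(A + A') \cdot B = A \cdot B + A' \cdot B, \qquad (22.9)$$
$$|A \cdot B|^2 \le (A \cdot A)(B \cdot B), \qquad (22.10)$$
$$A \cdot A \ge 0, \qquad (22.11)$$
$$A \cdot A = 0 \ \text{implies}\ A = 0. \qquad (22.12)$$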


A complex vector in a space of $n$ dimensions is simply an ordered set of
$n$ complex numbers. We define the Hermitian scalar product of two such
vectors $A$, $B$ by the equation

$$(A,B) = \sum_{k=1}^{n} A_k B_k^*. \qquad (22.13)$$

¹ Cf. Hermann Weyl, Raum, Zeit, Materie, 3d ed., Kap. 1, Berlin, 1920.

² We use a different notation for the scalar product of two three-dimensional
vectors from that adopted for the scalar product of two $n$-dimensional vectors, or of
two functions, because the two types of product sometimes occur in a close juxtaposi-
tion which might be confusing if the same symbol were used for both.




Here the complex conjugate of $B_k$ is introduced instead of $B_k$ itself to
insure that when $B$ is identified with $A$ the scalar product $(A,A)$ shall be
real and positive, or zero. The definition makes a slight formal change
in the first two laws (22.7) and (22.8), which become

$$(A,B) = (B,A)^* \qquad (22.14)$$

and

$$(aA,B) = a(A,B) = (A,a^*B), \qquad (22.15)$$

respectively. Equations (22.9), (22.10), (22.11), and (22.12) hold as
stated for complex vectors as well as for real vectors.
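
The modified laws are easily verified on a numerical example. The sketch below is purely illustrative; the helper `hdot` and the sample vectors are hypothetical choices, and the product is taken linear in the first factor and conjugate-linear in the second, as in Eq. (22.13).

```python
def hdot(A, B):
    """Hermitian scalar product (A, B) = sum_k A_k * conj(B_k)."""
    return sum(a * complex(b).conjugate() for a, b in zip(A, B))

A = [1 + 2j, 0.5 - 1j, -3j]
B = [2 - 1j, 1j, 4 + 0.5j]
c = 0.7 - 1.3j

lhs = hdot(A, B)
# Law (22.14): interchanging the factors conjugates the product.
print(lhs, hdot(B, A).conjugate())
# Law (22.15): a scalar leaves the first factor as it is,
# but must be conjugated when moved into the second factor.
print(hdot([c * a for a in A], B), c * lhs,
      hdot(A, [c.conjugate() * b for b in B]))
# Law (22.10), the Schwarz inequality, and the reality of (A, A).
print(abs(lhs) ** 2 <= hdot(A, A).real * hdot(B, B).real, hdot(A, A))
```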

22d. Scalar Products of Quadratically Integrable Functions of m
Variables. — Consider next a pair of complex functions, say $f$ and $g$, of
the real variables $x_1, x_2, \ldots, x_m$ defined and piece-by-piece continuous
in a certain domain $K$ of configuration space. We symbolize the scalar
product of $f$ and $g$ with respect to the domain $K$ by $(f,g)$ and define this
quantity by the equation

$$(f,g) = \int \cdots \int_K f g^*\, dx_1 \cdots dx_m. \qquad (22.16)$$

Usually the domain $K$ is identical with the region of definition of $f$ and $g$ so
that it need not be indicated explicitly on the scalar-product symbol.

The notation thus introduced is justified by the complete parallelism 
between the properties of the scalar product of two complex functions 
and the scalar product of two complex vectors, as will be seen by a 
comparison of the above equations with the following easily proved laws. 


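$$(f,g) = (g,f)^*, \qquad (22.17)$$
$$(af,g) = a(f,g) = (f,a^*g), \qquad (22.18)$$
$$(f + f',g) = (f,g) + (f',g), \qquad (22.19)$$
$$|(f,g)|^2 \le (f,f)(g,g), \qquad \text{(Schwarz's inequality)} \qquad (22.20)$$
$$(f,f) \ge 0, \qquad (22.21)$$
$$(f,f) = 0 \ \text{implies}\ f \equiv 0. \qquad (22.22)$$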


Equations (22.21) and (22.22) enable us to correlate $(f,f)$ with the square
of the length of a real vector, or with the square of the absolute value
of a complex vector. It is sometimes called the norm of $f$ [in symbols
$(f,f) = Nf$] and becomes equal to unity when $f$ is normalized according
to the rule of Eq. (8.9). Similarly the square root of the norm of $f$
is analogous to the length of a vector and may be called the magnitude
of the function $f$. It is indicated by the symbol $\|f\|$, and has the basic
property

$$\|f + g\| \le \|f\| + \|g\|. \qquad (22.23)$$

$\|f - g\|$ is to be correlated with the distance from the end point of one
vector to the end point of another. As a corollary on Eq. (22.22) we
deduce that

$$\|f - g\| = 0 \ \text{implies}\ f = g. \qquad (22.24)$$
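
A numerical illustration of these laws for functions of a single variable may be helpful. The sketch assumes the domain $K$ to be the interval $0 \le x \le 1$ and evaluates the scalar product by the composite Simpson rule; the two sample functions and the helper names are hypothetical.

```python
import cmath
import math

def inner(f, g, a=0.0, b=1.0, steps=2000):
    """(f, g) = integral over [a, b] of f(x) * conj(g(x)) dx (Simpson; steps even)."""
    h = (b - a) / steps
    total = 0j
    for j in range(steps + 1):
        x = a + j * h
        w = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += w * complex(f(x)) * complex(g(x)).conjugate()
    return total * h / 3.0

def norm(f):
    """Magnitude ||f||, the square root of (f, f)."""
    return math.sqrt(inner(f, f).real)

def f(x):
    return x * cmath.exp(2j * math.pi * x)

def g(x):
    return (1.0 - x) + 1j * x * x

print(inner(f, g), inner(g, f).conjugate())                     # law (22.17)
print(abs(inner(f, g)) ** 2, inner(f, f).real * inner(g, g).real)  # law (22.20)
print(norm(lambda x: f(x) + g(x)), norm(f) + norm(g))           # law (22.23)
```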

In the quantum mechanics we have to do with functions whose range
of definition $K$ is infinite, and in which $K$ is bounded by finite singular
domains as well as by a surface at infinity. Hence the scalar products are
improper integrals, in general, and the following theorem is important.¹

Theorem: The quadratic integrability of the functions $f$ and $g$ over a
common domain $K$ implies the absolute convergence of the scalar product
$(f,g)$ taken over $K$ and the quadratic integrability of any linear combination
of these functions over the same domain.

Proof: At any point in $K$,

$$|fg^*| = |f||g| \le \tfrac{1}{2}\big(|f|^2 + |g|^2\big).$$

Hence $(f,g)$ is absolutely convergent. Furthermore, at any point in $K$,

$$|f + g|^2 = |f|^2 + |g|^2 + fg^* + f^*g. \qquad (22.25)$$

As each of the terms on the right is integrable over $K$ it follows that
$|f + g|^2$ is integrable over $K$. Introducing the fact that any multiple
of a quadratically integrable function is quadratically integrable, we
obtain at once the general theorem that $af + bg$ is quadratically inte-
grable if $a$, $b$ are constants and the integrals $(f,f)$, $(g,g)$ exist.

The above theorem justifies the last paragraph in Appendix C and 
leads to the following corollary. 

Corollary: Any linear combination of a finite number of type A func- 
tions (for definition see p. 79) is a type A function. 

The n-dimensional Function Space. — On the basis of the definitions
of the operations of addition, multiplication by a complex number,
and scalar multiplication, one can obtain an immediate one-to-one
correspondence between the class of complex functions which are formed
from linear combinations of any $n$ linearly independent quadratically
integrable functions $f_1, f_2, \ldots, f_n$ of domain $K$ and the class of complex
vectors in a space of $n$ dimensions.² It will be convenient to establish
this correspondence with the aid of the concept of orthogonality.

¹ Cf. J. von Neumann, M.G.Q., p. 32.

² A set of $n$ functions or vectors $f_1, f_2, \ldots, f_n$ is said to be linearly independent
if no one of them can be expressed as a linear combination of the others, or if a relation
of the form

$$\sum_{i=1}^{n} a_i f_i = 0,$$

where the $a$'s are complex numbers, can hold only if all the $a$'s vanish.




Definition: If the scalar product of two vectors or of two functions 
having a common domain K is zero, we say that the vectors or functions 
are orthogonal. 

Thus Eq. (20.10) asserts the mutual orthogonality of the different
eigenfunctions of Eq. (20.1).

It follows as a corollary on the definition of orthogonality that any
set of mutually orthogonal functions is necessarily linearly independent.
Conversely, it can be proved that if we have given an arbitrary linearly
independent set of quadratically integrable functions $f_1, f_2, \ldots, f_n$,
it is always possible to form a mutually orthogonal set from them by a
suitable homogeneous linear transformation¹

$$\sum_{k=1}^{n} \alpha_{ik} f_k = g_i, \qquad i = 1, 2, \ldots, n. \qquad (22.26)$$

Furthermore we can always choose this transformation so that the ele-
ments of the orthogonal set will be normalized according to the rule

$$(g_i, g_i) = 1, \qquad i = 1, 2, \ldots, n.$$

In this case the functions g are said to form a normal orthogonal set, or, 
commonly, an orthonormal set. 
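
One familiar way of constructing such a transformation is the Gram-Schmidt process, which is offered here only as an illustration and is not necessarily the construction of the reference cited. The sketch assumes the domain $-1 \le x \le 1$ and represents each function as a polynomial by its list of coefficients (lowest power first), so that the scalar products can be evaluated in closed form. The helper names are hypothetical.

```python
import math

def inner(p, q):
    """(p, q) on [-1, 1] for polynomials given as coefficient lists."""
    total = 0.0
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:              # odd powers integrate to zero
                total += a * complex(b).conjugate() * 2.0 / (i + j + 1)
    return total

def axpy(c, p, q):
    """Return the coefficient list of c*p + q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [c * a + b for a, b in zip(p, q)]

def gram_schmidt(fs):
    """Orthonormal set g_1..g_n built from linearly independent f_1..f_n."""
    gs = []
    for f in fs:
        g = list(f)
        for e in gs:
            g = axpy(-inner(f, e), e, g)      # remove the component of f along e
        magnitude = math.sqrt(inner(g, g).real)
        gs.append([c / magnitude for c in g])
    return gs

basis = gram_schmidt([[1], [0, 1], [0, 0, 1]])   # from 1, x, x**2
for i, gi in enumerate(basis):
    print([round(abs(inner(gi, gj)), 12) for gj in basis], gi)
```

Starting from $1$, $x$, $x^2$ the process yields the normalized Legendre polynomials, each $g_i$ being a linear combination of the $f$'s of the form (22.26).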

Let us now employ such a normal orthogonal set to establish the
previously mentioned correspondence between the class of functions
which are expressible as linear combinations of the $f$'s (call it class $M$)
and the class of complex vectors in space of $n$ dimensions. By means
of Eqs. (22.26) we can express every function of class $M$ as a linear
combination of the $g$'s. Thus, if $\psi$ belongs to class $M$,

$$\psi = \sum_{i=1}^{n} a_i g_i \qquad (22.27)$$

and the coefficients $a_k$ are given by the formula

$$(\psi, g_k) = \sum_{i=1}^{n} a_i (g_i, g_k) = a_k. \qquad (22.28)$$

Every such set of coefficients may be used to define a unique vector
$(a_1, a_2, \ldots, a_n)$ in a space of $n$ dimensions, the individual coefficients
being treated as components along a set of mutually orthogonal axes.
Conversely, every vector in such a space yields a set of coefficients which
determine a unique function which belongs to class $M$. Thus the one-to-
one correspondence is established.

We can go much farther than this, however. Let $V_\psi$ denote the
vector which is correlated with the function $\psi$ of class $M$. If we add

¹ Cf. Courant-Hilbert, M.M.P., Kap. II, §1.





two such functions $\psi$ and $\phi$ we obtain a new function, $\chi = \psi + \phi$, which
belongs to class $M$, and it is obvious that the vector $V_\chi$ of this new
function is the sum of the vectors $V_\psi$ and $V_\phi$. Similarly, if we multiply $\psi$
by an arbitrary complex number $c$, we obtain a new function of class $M$
whose vector is $cV_\psi$. Finally, if we take the scalar product of two
functions of class $M$ we obtain a complex number which is exactly equal
to the scalar product of the corresponding vectors. Thus

$$(\psi,\phi) = \Big(\sum_{i=1}^{n} a_i g_i,\ \sum_{i=1}^{n} b_i g_i\Big) = \sum_{i=1}^{n} a_i b_i^*. \qquad (22.29)$$

It follows as a corollary that the magnitude of the function ^ is equal to 
the absolute value of its vector. Consequently we have a basic struc- 
tural similarity between the two classes of entities under consideration. 
They are said to be isomorphous. 

The end points of all possible real vectors drawn from any fixed origin
in a space of three dimensions fill up that space. Hence it is customary to
apply the term three-dimensional space to the class of all such vectors.
Similarly the class of complex vectors having $n$ components is said to
form an $n$-dimensional space and we can equally well apply this term
to the class of all functions formed by linear combinations of $n$ linearly
independent functions. To distinguish such a class, or space, from a
space composed of vectors we call it a function space.

22e. Spaces of Infinitely Many Dimensions. — More important varie- 
ties of function space are defined by the class of all type A solutions of a 
given Schrodinger equation and by the class, say L2K, of all functions 
quadratically integrable over a common domain K in configuration 
space. These more general classes of functions can be correlated with 
the class of complex vectors of finite magnitude, or "length," and
infinitely many dimensions. Such a vector is defined as an infinite
sequence of complex numbers $a_1, a_2, a_3, \ldots$ such that $\sum_{i=1}^{\infty} a_i a_i^*$ is con-
vergent. The operations of addition, multiplication by a number,
and scalar multiplication are defined as for vectors in space of $n$ dimen-
sions.¹ The correlation of the elements of L2K with the elements of
this class of complex vectors is most easily formulated with the aid of an
infinite orthonormal system of functions $g_1, g_2, g_3, \ldots$ in L2K such that,
if $\psi$ is any other element of L2K, the completeness relation

$$\lim_{n \to \infty} \Big\| \psi - \sum_{i=1}^{n} a_i g_i \Big\| = 0, \qquad a_i = (\psi, g_i), \qquad (22.30)$$


¹ The infinitely many-dimensioned space defined by this class of vectors is called
Hilbert space. It belongs along with L2K to a class of classes all of which conform to
the same set of postulates and which is called abstract Hilbert space (cf. references in
footnote 1, p. 114).




is satisfied (cf. Sec. 25b, pp. 136 and 137). In physical applications this
relation is usually equivalent to the validity of the series expansion

$$\psi = \sum_{i=1}^{\infty} a_i g_i, \qquad a_i = (\psi, g_i), \qquad (22.31)$$

but not always (cf. Sec. 36d, p. 255). When such a complete orthonormal
set of functions is given, we identify the scalar product of $\psi$ and $g_i$
with the $i$th coordinate of the complex vector $V_\psi$ and thereby establish
the desired correlation. If $\phi$ is a second element of the function space
L2K with the components $b_1, b_2, b_3, \ldots$, it can be proved that the
scalar product of $\psi$ and $\phi$ is given by

$$(\psi,\phi) = \sum_{i=1}^{\infty} a_i b_i^*. \qquad (22.32)$$

J 

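To demonstrate this proposition we note first of all that (22.30) is
equivalent to

$$(\psi,\psi) = \sum_{i=1}^{\infty} a_i a_i^*.$$

Since this equation holds for any element of L2K, it holds for the linear
combination of functions $\chi = \psi + \lambda\phi$, where $\lambda$ is an arbitrary parameter.
Using the procedure of footnote 3, p. 36, we have

$$(\chi,\chi) = (\psi,\psi) + \lambda^*(\psi,\phi) + \lambda(\phi,\psi) + \lambda\lambda^*(\phi,\phi)$$
$$= \sum_{i=1}^{\infty} a_i a_i^* + \lambda^* \sum_{i=1}^{\infty} a_i b_i^* + \lambda \sum_{i=1}^{\infty} b_i a_i^* + \lambda\lambda^* \sum_{i=1}^{\infty} b_i b_i^*.$$

Since $\lambda$ is arbitrary, the coefficients of $\lambda$ and $\lambda^*$ on the two sides of the
equation must be equal independently. This proves (22.32).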

The right-hand member of (22.32) forms the extrapolation to $n = \infty$
of the scalar product of two complex n-dimensional vectors. We conclude 
that we can identify the scalar product of two functions with the scalar 
product of their vector representatives in Hilbert space. This completes 
the definition of the scalar product of two functions and brings us back 
to the problem of series expansion with which we started the present 
section. 
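
Equation (22.32) can be tested numerically on a simple complete system. The sketch below is an illustration under stated assumptions: the domain is taken to be $0 \le x \le \pi$, the orthonormal system is $g_i = \sqrt{2/\pi}\,\sin ix$, and $\psi$, $\phi$ are two hypothetical test functions which vanish at the end points so that the coefficients fall off rapidly. The infinite sum is truncated at $N$ terms.

```python
import math

def inner(f, g, steps=4000):
    """(f, g) on [0, pi] by the composite Simpson rule."""
    h = math.pi / steps
    total = 0j
    for j in range(steps + 1):
        x = j * h
        w = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += w * complex(f(x)) * complex(g(x)).conjugate()
    return total * h / 3.0

def g_i(i):
    """Orthonormal sine function sqrt(2/pi) * sin(i x) on [0, pi]."""
    return lambda x: math.sqrt(2.0 / math.pi) * math.sin(i * x)

def psi(x):
    return x * (math.pi - x)

def phi(x):
    return (1 + 2j) * x * x * (math.pi - x)

N = 60                                        # truncation of the infinite sum
a = [inner(psi, g_i(i)) for i in range(1, N + 1)]
b = [inner(phi, g_i(i)) for i in range(1, N + 1)]

direct = inner(psi, phi)                      # left-hand member of (22.32)
series = sum(ai * bi.conjugate() for ai, bi in zip(a, b))
print(direct, series)
```

The two numbers agree to the accuracy permitted by the truncation; for these test functions the exact value is $(1 - 2i)\pi^6/60$.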

22f. Proof of Orthogonality of Eigenfunctions of the One-dimensional
Anharmonic Oscillator Problem. — Although the mutual orthogonality
of the functions $g_i$ is not a necessary condition for the validity of an
expansion such as $\psi = \sum_{i=1}^{\infty} a_i g_i$, it greatly facilitates the evaluation of the
coefficients in such a series. Without this convenient property or an
equivalent one (cf. Sec. 23), the determination of the coefficients would
be difficult, if not practically impossible. We have already seen [Eq.
(20.10)] that the discrete eigenfunctions of the ideal linear oscillator
problem of Sec. 20 are mutually orthogonal, and it will now be proved
that the same property is shared by the type A eigenfunctions of the
general one-dimensional oscillator problem of Eq. (18.2).

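It follows from the above-mentioned differential equation that if
$\psi_n$ and $\psi_m$ are two discrete eigenfunctions having the eigenvalues $E_n$
and $E_m$, respectively,

$$\frac{d}{dx}\Big(\psi_n \frac{d\psi_m^*}{dx} - \psi_m^* \frac{d\psi_n}{dx}\Big) = \psi_n \frac{d^2\psi_m^*}{dx^2} - \psi_m^* \frac{d^2\psi_n}{dx^2} = \frac{8\pi^2\mu}{h^2}(E_n - E_m)\,\psi_n\psi_m^*.$$

Integrating over all values of $x$, we obtain

$$\frac{8\pi^2\mu}{h^2}\,[E_n - E_m](\psi_n,\psi_m) = \Big[\psi_n \frac{d\psi_m^*}{dx} - \psi_m^* \frac{d\psi_n}{dx}\Big]_{-\infty}^{+\infty}. \qquad (22.33)$$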


Since the eigenfunctions and their derivatives vanish at the boundary
points $x = \pm\infty$, the right-hand member of Eq. (22.33) is zero. It
follows that $\psi_n$ and $\psi_m$ are orthogonal unless $E_n = E_m$, in which case one
of the functions is a multiple of the other.

The generalization of the above theorem for other one-dimensional 
problems will be given in the next section, and in Sec. 32d an extension to 
problems having many dimensions will be given. 


23. SELF-ADJOINT OPERATORS AND EQUATIONS. THE
STURM-LIOUVILLE PROBLEM 

23a. Self-adjoint Differential Operators in One Dimension. — Since 
we shall encounter more general one-dimensional eigenvalue-eigenfunc- 
tion problems than that of Sec. 18, it is important to investigate the 
conditions under which a problem of this type based on a second-order 
linear homogeneous differential equation will lead to a set of mutually
orthogonal eigenfunctions. In other words, we need to develop an 
extension of the orthogonality theorem of Sec. 22 to equations of a 
more general character. 

It is convenient to introduce the Hamiltonian operator

$$H = -\frac{h^2}{8\pi^2\mu}\frac{d^2}{dx^2} + V(x) \qquad (23.1)$$

and to rewrite Eq. (18.2) in the form

$$Hy - Ey = 0. \qquad (23.2)$$

The essential feature of the proof of the orthogonality of the eigenfunc-
tions is evidently to be found in the identity

$$z(x)Hy(x) - y(x)Hz(x) \equiv \frac{h^2}{8\pi^2\mu}\frac{d}{dx}\Big[y\frac{dz}{dx} - z\frac{dy}{dx}\Big], \qquad (23.3)$$

which holds for arbitrary continuous and twice-differentiable functions
$y(x)$ and $z(x)$. We therefore seek a generalization of Eq. (23.3).

Consider the second-order linear homogeneous differential equation

$$p_0(x)y'' + p_1(x)y' + p_2(x)y + \lambda\rho(x)y = 0. \qquad (23.4)$$

Here $\lambda$ is a parameter which is to play the same role as $E$ in Eqs. (18.2)
and (23.2), while $y'$ and $y''$ denote, respectively, the derivatives $dy/dx$ and
$d^2y/dx^2$. The coefficients $p_0(x)$, $p_1(x)$, $p_2(x)$, $\rho(x)$ are arbitrary but
supposedly known functions of $x$, defined and continuous in the interior
of a certain interval $K$, say $a < x < b$ of the $x$ axis. Either $a$, or $b$, or
both, may eventually become infinite. The functions $p_0(x)$, $p_1(x)$, $p_2(x)$
are permitted to take on complex values, but $\rho(x)$ and the independent
variable $x$ itself are restricted to real values. We further assume that
$\rho(x)$ is of the same sign throughout the interval $K$. Introducing the
symbol $\Lambda$ for the linear operator,

$$\Lambda = p_0\frac{d^2}{dx^2} + p_1\frac{d}{dx} + p_2, \qquad (23.5)$$

we rewrite (23.4) in the symbolic form

$$\Lambda y + \lambda\rho y = 0. \qquad (23.6)$$

Let us now assume that the differential operator $\Lambda$ satisfies a relation
of the form

$$z^*(x)\Lambda y(x) - y(x)\Lambda^* z^*(x) \equiv \frac{dF}{dx}, \qquad (23.7)$$

where $y$ and $z$ are arbitrary functions and $F$ denotes a bilinear form in the
arguments $y$, $y'$; $z^*$, $z^{*\prime}$. In other words,

$$F \equiv a_0(x)y'z^{*\prime} + a_1(x)y'z^* + a_2(x)yz^{*\prime} + a_3(x)yz^*,$$

where $a_0(x)$, $a_1(x)$, $a_2(x)$, $a_3(x)$ are functions to be evaluated from $p_0(x)$,
$p_1(x)$, $p_2(x)$. $z^*(x)$ is the complex conjugate of $z(x)$, and $\Lambda^*$ is the
operator which transforms $z^*$ into $(\Lambda z)^*$. It is derivable from $\Lambda$ by
replacing $p_0(x)$, $p_1(x)$, $p_2(x)$ by their complex conjugates. A differential
operator $\Lambda$ for which a relation of the form (23.7) is valid is said to be
self-adjoint.¹

¹ In the case of a real differential operator the symbol $\Lambda^*$ can be replaced by $\Lambda$
and the definition of the self-adjoint property given above reduces to that stated in
Courant-Hilbert, M.M.P. and other standard texts. In general, whether $\Lambda$ conforms
to (23.7) or not, there is a second-order differential operator $\Lambda^\dagger$ such that

$$z^*\Lambda y - y\Lambda^\dagger z^* \equiv \frac{dG}{dx},$$

where $G$ again denotes a bilinear form in the arguments $y$, $y'$; $z^*$, $z^{*\prime}$ with coefficients
to be evaluated from $p_0$, $p_1$, $p_2$. $\Lambda^\dagger$ is then said to be adjoint to $\Lambda$. However, a differ-
ent definition of the term "adjoint" will be found useful at a later stage in the develop-
ment of our theory (cf. Sec. 32, p. 203).





If Eq. (23.7) is to hold for all functions $y$ and $z$, the coefficients of
each of the products $yz^*$, $yz^{*\prime}$, $y'z^*$, $\ldots$ on the two sides of the equation
must be equal. Hence it follows¹ that $\Lambda$ must have the special form

$$\Lambda = p_0\frac{d^2}{dx^2} + [p_0' + if]\frac{d}{dx} + \Big[g + \frac{i}{2}f'\Big], \qquad (23.5a)$$

where $p_0$, $f$, and $g$ are real functions of $x$, and $i$ denotes the square root
of $-1$. The function $F$ becomes

$$F = p_0(z^*y' - yz^{*\prime}) + ifyz^*. \qquad (23.8)$$

Integrating Eq. (23.7) we obtain the Green's formula

$$\int_a^b (z^*\Lambda y - y\Lambda^* z^*)\,dx = F(b) - F(a). \qquad (23.9)$$

The next step in the derivation of our orthogonality theorem is to assume
that the self-adjoint equation (23.6) is to be solved subject to boundary
conditions at $x = a$ and $x = b$. Identify $y$ and $z$ with two eigenfunctions
$y_1$ and $y_2$, respectively. Then

$$\Lambda y = \Lambda y_1 = -\lambda_1\rho y_1,$$
$$\Lambda^* z^* = \Lambda^* y_2^* = -\lambda_2^*\rho y_2^*.$$

Equation (23.9) becomes

$$(\lambda_2^* - \lambda_1)\int_a^b \rho y_1 y_2^*\,dx = \lim_{x \to b}\Phi(x) - \lim_{x \to a}\Phi(x), \qquad (23.10)$$
$$\Phi(x) = p_0(x)[y_2^*(x)y_1'(x) - y_1(x)y_2^{*\prime}(x)] + if(x)y_1(x)y_2^*(x).$$

If the boundary conditions insure that the right-hand member of this
equation shall vanish, and if $\lambda_2^* \ne \lambda_1$, we see that the functions $\rho^{1/2}y_1$
and $\rho^{1/2}y_2$ are orthogonal in the region $K$. If $y_1$ and $y_2$ are identical, the
integral cannot vanish because $\rho$ does not change sign in $K$, and it follows
that $\lambda^* = \lambda$. Thus, the eigenvalues are all real.

23b. Orthogonality with Respect to a Density Function ρ. — In the
special case that $\rho$ is constant, the eigenfunctions of Eqs. (23.6) and
(23.5a) with suitable boundary conditions form an orthogonal system. If
$\rho$ is not constant, we may say that the functions are orthogonal with respect

¹ The equality of the coefficients yields the following relations:

$$0 = a_0, \quad (\alpha) \qquad 0 = a_1 + a_2, \quad (\beta)$$
$$p_0 = a_1, \quad (\gamma) \qquad p_0^* = -a_2, \quad (\delta)$$
$$p_1 = a_1' + a_3, \quad (\epsilon) \qquad p_1^* = -a_2' - a_3, \quad (\zeta)$$
$$p_2 - p_2^* = a_3'. \quad (\eta)$$

Equations $(\beta)$, $(\gamma)$, $(\delta)$ show that $p_0$ must be real. The next pair of equations
show that the real part of $p_1$ is equal to $p_0'$ while the imaginary part is equal to $a_3/i$.
Setting $a_3$ equal to $if$, we learn from the last equation that the imaginary part of
$p_2$ is $f'/2$.




to the function $\rho$. If the functions $\rho^{1/2}y_n$ are quadratically integrable,
it is convenient to normalize them so that

$$(\rho y_n, y_m) = \int_a^b \rho y_n y_m^*\,dx = \delta_{nm}. \qquad (23.11)$$

The problem of expanding an arbitrary function in terms of a system
of eigenfunctions having this modified type of orthogonality is no more
difficult than in the case of direct orthogonality. If it is required to
evaluate the coefficients in the series

$$f(x) = \sum_{n=0}^{\infty} a_n y_n(x), \qquad (23.12)$$

we have only to multiply through by $\rho y_k^*$ and integrate in order to obtain

$$a_k = \int_a^b \rho(x)f(x)y_k^*(x)\,dx = (\rho f, y_k). \qquad (23.13)$$

If $\rho^{1/2}f(x)$ is quadratically integrable over $(a, b)$, or if $|f(x)|$ is bounded
and $\rho y_k(x)$ is absolutely integrable, the coefficient $a_k$ is sure to exist.

23c. The Sturm-Liouville Problem. — Usually the coefficients in Eq.
(23.5a) are restricted to real values. Then the most general case in which
the equation is self-adjoint is that in which it has the form¹

$$\frac{d}{dx}\Big(p\frac{dy}{dx}\Big) - qy + \lambda\rho y = 0. \qquad (23.14)$$

This is the Sturm-Liouville equation whose boundary-value problems
are so fully discussed in various mathematical texts. It is assumed
that all three of the coefficients $p(x)$, $q(x)$, $\rho(x)$ are continuous in the
interior of the fundamental interval $a < x < b$ and that $p(x)$ and $q(x)$
are differentiable in the same region. To eliminate the possibility of a
singular point in the fundamental interval we assume that $p(x)$ is positive
throughout that interval. Finally we require that $\rho(x)$ shall be of one
sign throughout the closed interval $a \le x \le b$.

¹ The notation for the coefficients of Eq. (23.14) has been altered from that of
Eq. (23.5a) to conform to the usage of Courant-Hilbert.

The general second-order linear homogeneous differential equation (23.4) with
real coefficients is thrown into the Sturm-Liouville form (23.14) if multiplied through
by the function

$$f(x) = \frac{1}{p_0}\exp\Big(\int \frac{p_1}{p_0}\,dx\Big). \qquad (a)$$

If the coefficients $p_0$, $p_1$, $p_2$ are continuous in the interval $a < x < b$ and if $p_0$ has no
nodes between $a$ and $b$, the equation will have no singular points in the same interval
(cf. footnote 1, p. 79), and $f(x)$ will not vanish or become infinite there. We conclude
that if Eq. (23.4) has no singular points in the interval $a < x < b$ it may be converted
to Sturm-Liouville form without introducing such points.





Equation (18.2) is a simple special case of the Sturm-Liouville equa-
tion in which $p$ and $\rho$ are constants. A more general example arises in
connection with the study of the vibrations of a stretched string with
variable density and variable elastic modulus. The problem of solving
Eq. (23.14) in the interval $a < x < b$ subject to linear homogeneous
boundary conditions such as

$$\gamma_1 y(a) + \gamma_1' y'(a) = 0, \qquad \gamma_2 y(b) + \gamma_2' y'(b) = 0, \qquad (\gamma\text{'s} = \text{constants}) \qquad (23.15)$$

or

$$y(a) = y(b), \qquad p(a)y'(a) = p(b)y'(b), \qquad (23.16)$$

is called the Sturm-Liouville problem. If $\gamma_1'$ and $\gamma_2'$ are zero, we have the
important special case in which $y(x)$ is required to vanish at the boundary
points. These boundary conditions insure that if $y(x)$ is a solution of
the problem, any constant multiplied into $y(x)$ will yield another solution
for the same value of $\lambda$. Furthermore, the real and imaginary parts of
every solution of a Sturm-Liouville problem are themselves solutions,
so that the most general solution for a given value of $\lambda$ can be obtained
from the corresponding real solution, or solutions. There is then no
appreciable loss of generality in restricting the discussion to real functions
$y(x)$. The boundary conditions given above also have the property of
reducing the right-hand member of Eq. (23.10) to zero and hence yield a
set of eigenfunctions which are orthogonal with respect to $\rho(x)$ in the
interval $a < x < b$.
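
A concrete case may make the role of the density function plainer. The example below is not taken from the text and is offered only as a sketch: the equation $x^2y'' + xy' + \lambda y = 0$, multiplied through by $1/x$, takes the Sturm-Liouville form with $p = x$, $q = 0$, $\rho = 1/x$, and with the boundary conditions $y(1) = y(e^\pi) = 0$ it has the eigenfunctions $y_n = \sin(n \ln x)$, $\lambda_n = n^2$. The code checks by Simpson's rule that these functions are orthogonal with respect to $\rho$ but not in the direct sense. The helper names are hypothetical.

```python
import math

B = math.exp(math.pi)        # right-hand end point, chosen so that y_n(B) = 0

def y(n, x):
    """Eigenfunction sin(n ln x) of (x y')' + lambda * y / x = 0, lambda = n**2."""
    return math.sin(n * math.log(x))

def rho(x):
    """Density function of the example."""
    return 1.0 / x

def simpson(func, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3.0

def weighted(n, m):
    """Integral of rho * y_n * y_m over the fundamental interval."""
    return simpson(lambda x: rho(x) * y(n, x) * y(m, x), 1.0, B)

def unweighted(n, m):
    """Integral of y_n * y_m without the density function."""
    return simpson(lambda x: y(n, x) * y(m, x), 1.0, B)

print(weighted(1, 2), weighted(2, 2), unweighted(1, 2))
```

The weighted integral vanishes for $n \ne m$ and equals $\pi/2$ for $n = m$, so that $\sqrt{2/\pi}\,y_n$ is normalized in the sense of Eq. (23.11); the unweighted integral of $y_1 y_2$ is far from zero.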

The usual theory of the Sturm-Liouville problem assumes that the
interval $a < x < b$ is finite and that the boundary points as well as all
interior points are nonsingular.¹ In the wave mechanics, on the other
hand, we meet a variety of one-dimensional eigenvalue problems based
on equations of the form of (23.14) but involving singular end points
and, frequently, infinite intervals $a < x < b$. These equations are
usually obtained from multidimensional problems by the method of
the separation of variables. Thus we are confronted with a need for
an extension of the standard Sturm-Liouville theory covering the type of
problem which arises in our physical discussions. Such an extension
is sketched in this and the two succeeding sections.²

23d. Singular-point Boundary Conditions. — Every eigenvalue-eigen- 
function problem based on a Sturm-Liouville equation and of physical 

¹ Cf. footnote 1, p. 79.

² The singular end-point problems are of course an old story to mathematicians,
but the writer has not been able to discover any comprehensive treatment of a suffici-
ently elementary character to meet the needs of physicists. The powerful and elaborate
work of H. Weyl [Math. Ann. 68, 220 (1910); Nachrichten d. Kgl. Gesell. d. Wissen-
schaften zu Göttingen, Math.-phys. Klasse, 1910, p. 1] covers most of the ground, but
has for its primary purpose the study of the continuous spectrum and is not adapted
to the requirements of beginners.




origin will make use of boundary conditions based on physical con- 
siderations. In wave mechanics these physical boundary conditions 
can be derived from the boundary and continuity conditions for type 
A and type B functions given in Sec. 17, or from the more elaborate 
conditions for physically admissible many-dimensional wave functions 
formulated in Sec. 32b. Examples of such derivations are to be found in
Sec. 28. Owing to the variation in the physical boundary conditions 
from problem to problem, and to a certain awkwardness in form, however, 
it is convenient for the development of the general theory of the proper- 
ties of eigenfunctions to replace them by a single, suitably chosen, 
standard mathematical condition to which they are normally equivalent 
in practice. 

To this end the following singular-point boundary condition (s.p.b.c.)
for discrete eigenfunctions is proposed.¹ The function y(x) shall be said
to conform to the singular-point boundary condition for Eq. (23-14) at the
left-hand boundary point x = a if $\int_a^{\xi} \rho|y|^2\,dx$ exists when $a < \xi < b$, and
when a positive real number $\epsilon$ and a real number m exist such that for
positive values of x − a in the neighborhood of x − a = 0 the functions

$$p(x)|y|^2(x-a)^{-m}, \qquad p(x)|y'|^2(x-a)^{m-\epsilon} \tag{23-17}$$

are bounded. The numbers $\epsilon$ and m are to be independent of the parameter $\lambda$
whose eigenvalues are sought. The corresponding condition for the right-hand
boundary point b is similar in form and need not be written down
explicitly. Either or both of the points a, b may be at infinity.

Although the above condition will seem somewhat formidable to the
average physicist, it is not difficult to apply. In fact we can ordinarily
set m equal to $\epsilon$ and so replace the requirement that the functions
(23-17) shall be bounded near x = a by the simpler requirement that
$p(x)|y'|^2$ shall be bounded near x = a, while $\lim_{x\to a} p(x)|y|^2 = 0$. The
radial functions in the Dirac relativistic theory of the hydrogen atom
afford examples of the exceptional case of a function which satisfies
the more general condition but not the simplified one.

The singular-point boundary condition is said to be equivalent to the
physical boundary condition at x = a in any particular case if an integral
curve of the given differential equation which conforms to the s.p.b.c.
at that point must also conform to the physical condition and vice versa.
The possibility of such equivalence in the case of two conditions not
really identical lies in the fact that the functions under consideration
are solutions of the differential equation and hence have only one or two

1 Cf. E. C. Kemble, Proc. Nat. Acad. Sci. 19, 710 (1933), where the condition is
given a slightly more general form. The article should be corrected by the insertion
on p. 711 of the additional stipulation that the function g(x) is bounded in the
neighborhood of x = a.



Sec. 23] SELF-ADJOINT OPERATORS AND EQUATIONS 127 

types of behavior near x = a. This equivalence must be tested indi- 
vidually for each special problem by a study of the integral curves in the 
neighborhood of the singular boundary point under consideration. 
Methods for dealing with this question will be described later in this chap- 
ter. In the meantime it may be observed that the writer has found no 
case in which the equivalence of the physical boundary condition and the 
s.p.b.c. cannot be established without difficulty. 

Let us next turn our attention to the general properties of the s.p.b.c.
which make it useful.

a. If u(x) conforms to the s.p.b.c. at x = a,

$$\lim_{x\to a} p(x)|u|^2 = 0.$$

b. If $u_1(x)$ and $u_2(x)$ conform to the s.p.b.c. at x = a,

$$\lim_{x\to a}\,\{p(x)|u_1||u_2'|\} = 0. \tag{23-18}$$

This relation follows directly from the fact that the square root of the
product of $p(x)|u_1|^2(x-a)^{-m}$ and $p(x)|u_2'|^2(x-a)^{m-\epsilon}$ is bounded near
x = a.

c. If $u_1(x)$ and $u_2(x)$ conform to the s.p.b.c. at x = a, any linear
combination, say $w(x) \equiv \alpha u_1(x) + \beta u_2(x)$, will conform to the same
condition. This is a consequence of the inequality

$$|w|^2 \le 2[|\alpha u_1|^2 + |\beta u_2|^2]$$

(cf. p. 117). In this respect the singular-point condition shares one of the
major properties of the homogeneous boundary conditions for regular
boundary point problems [Eq. (23-15)].

d. Let $y_1(x)$ and $y_2(x)$ denote two linearly independent solutions of
the Sturm-Liouville equation (23-14) for the same energy value. Then
both of these cannot conform to the s.p.b.c. at either end point.

In order to verify the last of these properties it is convenient to introduce
the function

$$W[y_1,y_2] \equiv p(x)[y_1(x)y_2'(x) - y_2(x)y_1'(x)].$$

It follows from the differential equation that this function is independent
of x. If both $y_1$ and $y_2$ satisfy the s.p.b.c. at either boundary point it
follows from the property b that $W[y_1,y_2]$ vanishes identically. As p(x)
is not zero, we conclude that

$$y_1(x)y_2'(x) = y_2(x)y_1'(x).$$

Integration of this equation shows that $y_1$ and $y_2$ are linearly dependent,
contrary to hypothesis. Hence both of them cannot satisfy the s.p.b.c.
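The constancy of W used in this argument can be checked numerically. The following is a minimal sketch, assuming Eq. (23-14) has the form $(py')' - qy + \lambda\rho y = 0$; the coefficients p(x) = x, q(x) = 0, ρ(x) = x, the value λ = 1, the interval 1 ≤ x ≤ 2, and all function names are illustrative choices, not taken from the text.

```python
# Sketch: check that W = p*(y1*y2' - y2*y1') stays constant along x
# for two solutions of (p y')' - q y + lam*rho*y = 0 (assumed form of Eq. 23-14).

def p(x):      # illustrative coefficient, positive on [1, 2]
    return x

def q(x):      # illustrative choice
    return 0.0

def rho(x):    # illustrative density function
    return x

LAM = 1.0      # illustrative parameter value

def rhs(x, s):
    # State s = [y1, u1, y2, u2] with u = p*y', so y' = u/p and u' = (q - lam*rho)*y.
    y1, u1, y2, u2 = s
    c = q(x) - LAM * rho(x)
    return [u1 / p(x), c * y1, u2 / p(x), c * y2]

def rk4_step(f, x, s, h):
    # One classical fourth-order Runge-Kutta step.
    k1 = f(x, s)
    k2 = f(x + h / 2, [si + h / 2 * ki for si, ki in zip(s, k1)])
    k3 = f(x + h / 2, [si + h / 2 * ki for si, ki in zip(s, k2)])
    k4 = f(x + h, [si + h * ki for si, ki in zip(s, k3)])
    return [si + h / 6 * (a + 2 * b + 2 * c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]

def wronskian_at_ends(a=1.0, b=2.0, n=2000):
    # Two independent solutions: (y1, p*y1') = (1, 0) and (y2, p*y2') = (0, 1) at x = a.
    s = [1.0, 0.0, 0.0, 1.0]
    w_start = s[0] * s[3] - s[2] * s[1]   # W = y1*(p*y2') - y2*(p*y1')
    h = (b - a) / n
    x = a
    for _ in range(n):
        s = rk4_step(rhs, x, s, h)
        x += h
    w_end = s[0] * s[3] - s[2] * s[1]
    return w_start, w_end

if __name__ == "__main__":
    print(wronskian_at_ends())
```

The two returned values should agree to within the integration error. Since W is a nonzero constant for linearly independent solutions, it cannot tend to zero at a boundary point, which is the content of property d.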

An important corollary of property b is that if $y_1$ and $y_2$ are eigenfunctions
of a Sturm-Liouville problem with singular-point boundary
conditions and the eigenvalues $\lambda_1$ and $\lambda_2$, respectively, the function $\phi(x)$
of Eq. (23-10) vanishes at the end points. Hence the eigenvalues are
real and the eigenfunctions are orthogonal with respect to the density
function $\rho$. In this respect also the new boundary condition shares a
fundamental property of the homogeneous boundary conditions of
Eq. (23-15). We shall therefore classify eigenvalue-eigenfunction problems
based on Eqs. (23-14), (23-17), and (23-18) along with those based
on homogeneous conditions at regular boundary points as Sturm-Liouville
problems. It follows from d that there cannot be two linearly independent
eigenfunctions for such a problem with the same eigenvalue.

Solutions of the standard linear oscillator problem of Secs. 18 and 19
for Case II are readily seen to satisfy the singular-point boundary conditions
and hence all general theorems proved for Sturm-Liouville problems
with singular boundary points are applicable to the discrete spectra of
linear oscillator problems.

*23e. Existence of Discrete Eigenvalues for Sturm-Liouville Problems
with Singular End Points. We have already discussed the existence
and properties of discrete solutions of the standard linear-oscillator
problem in Sec. 19. In order to deal with the general case of an arbitrary
S.L. problem with singular end points, we postulate the existence of an
interval G of $\lambda$ values, say $\lambda' \le \lambda \le \lambda''$, such that for every $\lambda$ in G there
exists a pair of integral curves $u_\lambda(x)$, $v_\lambda(x)$ which conform to the s.p.b.c.
at the boundary points a and b, respectively, and which do not have an
infinite number of nodes in the neighborhood of these points. Proper
normalization of these integral curves will then yield a pair of continuous
functions of x and $\lambda$, say $u(x,\lambda)$, $v(x,\lambda)$, which for any given $\lambda$ in G satisfy
the conditions laid down for $u_\lambda$ and $v_\lambda$.

If the function

$$W[u(x,\lambda), v(x,\lambda)] \equiv p(x)[u(x,\lambda)v'(x,\lambda) - v(x,\lambda)u'(x,\lambda)]$$

vanishes for any pair of values of x and $\lambda$, say $\xi$ and $\bar\lambda$, it vanishes
identically in x for $\lambda = \bar\lambda$ and the functions $u(x,\bar\lambda)$, $v(x,\bar\lambda)$ are linearly
dependent. Hence both functions satisfy the s.p.b.c. at both of the end points,
and $\bar\lambda$ is an eigenvalue with $u(x,\bar\lambda)$ and $v(x,\bar\lambda)$ as eigenfunctions. Every
such eigenfunction has a finite number of nodes.

As $\lambda$ increases, the spacing of the nodes of $u(x,\lambda)$ and $v(x,\lambda)$ decreases.
The nodes of $u(x,\lambda)$ move to the left and those of $v(x,\lambda)$ to the right. We
omit the proof of this proposition as it is readily carried through by
standard methods (cf. footnote 1, p. 84), if we note that the function

$$W[u(x,\lambda_1), u(x,\lambda_2)] \equiv p(x)[u(x,\lambda_1)u'(x,\lambda_2) - u(x,\lambda_2)u'(x,\lambda_1)]$$

must vanish at x = a, while $W[v(x,\lambda_1), v(x,\lambda_2)]$ must vanish at x = b.

Whenever a node of $u(x,\lambda)$ passes one of $v(x,\lambda)$, the function

$$W[u(x,\lambda), v(x,\lambda)]$$

must vanish at the common nodal point. Hence every such nodal
crossing yields an eigenvalue of $\lambda$ and an eigenfunction. There are no
eigenvalues of $\lambda$ for which the nodes of $u(x,\lambda)$ do not coincide with
the nodes of $v(x,\lambda)$.

Let n' be the number of internal nodes of $u(x,\lambda')$ (lower limit of G),
and let n'' be the number of internal nodes of $u(x,\lambda'')$ (upper limit of
G). For simplicity we assume that $\lambda'$ and $\lambda''$ are not eigenvalues and
that n' is not zero. It follows at once from the above argument that
there exist n'' − n' eigenvalues in the interval G, each having an eigenfunction
with one more node than the eigenfunction of next lower eigenvalue.
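The node-counting rule can be illustrated on a regular stand-in problem. The sketch below assumes the equation $y'' + \lambda y = 0$ on 0 < x < π with y vanishing at both ends (so p = ρ = 1, q = 0), whose eigenvalues are 1, 4, 9, ...; the function names and the limits λ' = 0.5, λ'' = 10 are illustrative assumptions. Here the condition u(π, λ) = 0 plays the part of the vanishing of W[u, v].

```python
import math

def shoot(lam, n=4000):
    """Integrate y'' = -lam*y from x = 0 (y = 0, y' = 1) to x = pi with RK4.

    Returns (y(pi), number of interior sign changes of y).
    """
    h = math.pi / n
    y, v = 0.0, 1.0
    nodes = 0
    for _ in range(n):
        # RK4 for the first-order system y' = v, v' = -lam*y
        k1y, k1v = v, -lam * y
        k2y, k2v = v + h / 2 * k1v, -lam * (y + h / 2 * k1y)
        k3y, k3v = v + h / 2 * k2v, -lam * (y + h / 2 * k2y)
        k4y, k4v = v + h * k3v, -lam * (y + h * k3y)
        y_new = y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v_new = v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        if y * y_new < 0:          # sign change between grid points = one node
            nodes += 1
        y, v = y_new, v_new
    return y, nodes

def find_eigenvalue(lo, hi, tol=1e-10):
    """Bisect on the sign of y(pi; lam); assumes one sign change in [lo, hi]."""
    f_lo = shoot(lo)[0]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)[0]
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    n_lower = shoot(0.5)[1]     # nodes of u(x, lam') with lam' = 0.5
    n_upper = shoot(10.0)[1]    # nodes of u(x, lam'') with lam'' = 10
    print(n_lower, n_upper, n_upper - n_lower)
    print([find_eigenvalue(a, b) for a, b in [(0.5, 2.0), (2.0, 6.0), (6.0, 10.0)]])
```

With these limits the left-hand solution has no interior nodes at λ' and three at λ'', and exactly three eigenvalues (near 1, 4, and 9) lie between them, in agreement with the count n'' − n'.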

The case in which n' is zero requires special discussion since the eigenfunction of
lowest eigenvalue has no internal nodes and cannot be obtained in general from a
coincidence of the nodes of $u(x,\lambda)$ and $v(x,\lambda)$. It is convenient to change the
dependent variable of Eq. (23-14) to

$$w(x) = p(x)^{1/2}y(x). \tag{23-19}$$

The differential equation now takes the form

$$w''(x) + \left[\frac{1}{p}(\lambda\rho - q) - \frac{1}{\sqrt{p}}\frac{d^2\sqrt{p}}{dx^2}\right]w = 0. \tag{23-20}$$

Let $w_1(x,\lambda)$ and $w_2(x,\lambda)$ denote the functions $p(x)^{1/2}u(x,\lambda)$ and $p(x)^{1/2}v(x,\lambda)$,
respectively. Then $w_1$ vanishes at x = a, and $w_2$ at x = b. $W[u(x,\lambda), v(x,\lambda)]$ becomes

$$w_1w_2' - w_2w_1'.$$

If n' = 0, and $w_1(b,\lambda') = 0$, the functions $w_1$ and $w_2$ are linearly dependent
nodeless eigenfunctions of the eigenvalue $\lambda'$. If $w_1(b,\lambda') \ne 0$, the curves $w_1(x,\lambda')$ and
$w_2(x,\lambda')$ can be drawn in the upper half plane and will then intersect at some point
$x = \xi$ where $w_2' < w_1'$. It follows that

$$W[u(x,\lambda'), v(x,\lambda')] = w_1(w_2' - w_1') < 0.$$

Let $\lambda_g$ denote a value of $\lambda$ in the range G such that $w_1(x,\lambda_g)$ has a single interior node
lying to the right of the corresponding node of $w_2(x,\lambda_g)$. Any eigenvalue of $\lambda$ between
$\lambda'$ and $\lambda_g$ must have a nodeless eigenfunction. But $w_1(x,\lambda_g)$ and $w_2(x,\lambda_g)$ must cross
at some point $\xi_g$ where $w_1' - w_2' < 0$. If we now vary $\lambda$ between $\lambda'$ and $\lambda_g$, there
must be some value of $\lambda$ for which $w_1' - w_2'$ vanishes at the point where the curves
cross. The function $W[u(x,\lambda), v(x,\lambda)]$ vanishes for this value of $\lambda$ and the corresponding
functions $u(x,\lambda)$, $v(x,\lambda)$ are linearly dependent eigenfunctions. Thus, if $\lambda'$ and
$\lambda''$ are not eigenvalues, the number of eigenvalues in G is always n'' − n', even if n'
is zero.

If n'' > 0 and the lower limit $\lambda'$ of G is at $-\infty$, there is a minimum
eigenvalue $\lambda_0$ with a nodeless eigenfunction.

To prove this theorem we have only to verify that for sufficiently
large negative values of $\lambda$ the curves for $w_1(x,\lambda)$ and $w_2(x,\lambda)$ when drawn
in the upper half plane will intersect at a point where $w_1' > w_2'$. As a
very large negative value of $\lambda$ will make both curves convex to the
axis in all but eventual infinitesimal neighborhoods of the end points,
we may dispense with a complete analytic proof.

If the upper limit of G is $+\infty$, the sequence of ascending eigenvalues is
denumerably infinite. If $\lambda_n$ is the eigenvalue whose eigenfunction has n
interior nodes, $\lim_{n\to\infty}\lambda_n = \infty$.




In view of the above series of theorems the question of the existence 
and range of discrete eigenvalues for a Sturm-Liouville problem, subject 
to the singular-point boundary conditions, reduces to the problem of 
determining the range G (if any) in which the boundary conditions can 
be satisfied by real integral curves with a finite number of nodes. In 
case one or both of the boundary points is at infinity we can make the 
change of variables (23-19) and investigate the existence of solutions
of the equation satisfying the boundary condition at infinity by the 
methods of Sec. 19. Singular boundary points located at finite points 
can frequently be tested by methods to be developed in Sec. 26. 

In the case of a Sturm-Liouville problem with a finite fundamental 
region and regular end points subject to homogeneous boundary con- 
ditions, a parallel set of theorems can be proved. It is possible to show 
in particular that there is a denumerably infinite set of eigenvalues 
each having an eigenfunction with one more node than that of the
next lower eigenvalue. There is a minimum eigenvalue with a nodeless 
eigenfunction. 

24. REDUCTION OF EIGENVALUE PROBLEMS BASED ON SELF-ADJOINT 
DIFFERENTIAL EQUATIONS TO VARIATIONAL FORM^ 

Eigenvalue problems based on the differential equation 

$$A y + \lambda\rho y = 0, \tag{24-1}$$

in which A denotes the self-adjoint operator of Eq. (23-5a), can be
reduced to problems in the calculus of variations. This reduction is
of great value both in the solution of concrete cases and in the development
of general theorems. It can take a number of forms which we
designate as A, B, C. We here summarize the results of calculations
given in detail in Appendix E.

Variational Problem A. Let J denote the integral

$$J[y,\lambda] = \int_a^b y^*[Ay + \lambda\rho y]\,dx. \tag{24-2}$$

Let $\delta y$ denote the first variation in y(x) (cf. Appendix A) and let F(x) be
defined by 

F{x) = Po(y*-^^ - ^ + ify*Sy. (24-3) 

Let it be required to find a function y(x) and a corresponding value for the
parameter $\lambda$ which satisfy the variational equation

$$\delta J = 0, \tag{24-4}$$

1 Cf. Courant-Hilbert, M.M.P., Chap. VI; Riemann-Weber, D.P., Vol. I,
Chap. XX.




when the comparison functions are unrestricted except for the requirement
of piece-by-piece continuous first and second derivatives, the condition
that J must exist, and for certain linear boundary conditions
(a) which insure that¹

$$\lim_{x\to a}F(x) - \lim_{x\to b}F(x) = 0. \tag{24-5}$$

The functions y(x) which solve this problem are called extremals.

Theorem a.² The required stationary values of J are all zero. They
exist only for a set of eigenvalues of $\lambda$ identical with the eigenvalues of
Eq. (24-1) when solved subject to the same boundary conditions (a). The
extremals of the variational problem and the eigenfunctions of the differential
equation problem are identical.

Variational Problem B. Let Q and N denote the integrals

$$Q[y] = -\int_a^b y^*Ay\,dx; \qquad N[y] = \int_a^b \rho yy^*\,dx. \tag{24-6}$$

Stationary values of the ratio Q/N are required when the admissible
comparison functions are subject to the same restriction as in the preceding
problem and to the additional restriction that they shall yield convergent
values of Q and N. (Since the continuous spectrum eigenfunctions
are not quadratically integrable, the present method can yield only discrete
eigenvalues.)

Theorem b.² The stationary values of Q/N are identical with the eigenvalues
of $\lambda$ in the corresponding problem A and the extremals are identical
with those of problem A.

Every Sturm-Liouville problem with a finite fundamental interval,
regular end points, and homogeneous boundary conditions can be
reduced to the form B. Sturm-Liouville problems involving infinite
intervals and singular end points, subject to the singular-point boundary
conditions, can be reduced to the form B provided that the eigenfunctions
yield convergent integrals N. This restriction is automatically fulfilled
by the physically admissible class A wave functions of quantum mechanics.

Variational Problem C. Let it be required to find a function y(x)
which yields a stationary value of Q[y] itself when the admissible comparison
functions are subject to the restrictions of Problem B and also
to the normalization condition N[y] = 1.

1 This requirement is a generalization of the boundary requirement for the
orthogonality theorem of Sec. 23. Hence the latter requirement is always fulfilled if Eq.
(24-5) holds for all admissible comparison functions. Evidently the condition that y
shall vanish at the boundary points is normally sufficient. If A is real (Sturm-Liouville
case), we may deduce Eq. (24-5) from the singular-point boundary condition [cf. Eq.
(23-18)].

2 For proof cf. Appendix E.




Theorem c.¹ Every solution of Problems A and B when normalized
by the introduction of a suitable constant factor yields a solution of Problem
C. There are no other solutions of the latter problem. The stationary
values of Q for the solutions of Problem C are the eigenvalues of $\lambda$ for Problem
A.

Application to Sturm-Liouville Case. An integration by parts yields
the following alternative expression for the integral Q:

Q[y\ = J] - iSy*y' - + ^^ 2 / 2 /* jrfx + po{a)y*{a)y'{a) 

- Po{b)y*{b)y'{b). (24-7) 

This expression is particularly useful in the case of the Sturm-Liouville
problems where the coefficient f(x) vanishes identically. In applying
Eq. (24-7) to this type of problem we revert to the notation of Eq. (23-14),
replacing $p_0(x)$ by p(x) and g(x) by −q(x). If the boundary conditions
are of the singular-point type [cf. Eqs. (23-17) and (23-18)], we have

$$Q[y] = \int_a^b [p|y'|^2 + q|y|^2]\,dx \tag{24-8}$$

for all admissible comparison functions.

25. COMPLETENESS OF SYSTEM OF DISCRETE EIGENFUNCTIONS OF A
STURM-LIOUVILLE PROBLEM

*25a. The Eigenvalues as Absolute Minima. A number of the most
important properties of the Sturm-Liouville eigenfunctions and eigen- 
values can be deduced from the application of the variational method to 
problems of the S. L. type. Courant and Hilbert have been leaders 
in the development of this mode of attack and readers are referred to their 
text for a more complete exposition of the results obtainable. We are 
interested here in the problem of the expansion of arbitrary functions in 
series of eigenfunctions of Sturm-Liouville problems and will follow the 
line of argument developed by Courant-Hilbert, indicating at the same 
time the way in which it can be extended to cover problems involving 
singular end points and infinite fundamental intervals. 

Let us then consider a Sturm-Liouville problem having either regular
end points or singular end points. If the boundary points are singular,
we shall suppose that for each value of $\lambda$ there is a one-parameter family
of integral curves $\alpha u(x,\lambda)$ having a finite number of nodes which satisfies
the left-hand boundary condition, and another such family $\beta v(x,\lambda)$
which satisfies the right-hand boundary condition, whereas integral
curves not linearly dependent on $u(x,\lambda)$ or $v(x,\lambda)$ do not satisfy the
corresponding boundary condition. The interval G of Sec. 23e will therefore
extend from $\lambda = -\infty$ to $\lambda = +\infty$. In any such case it was proved in
1 For proof cf. Appendix E.





Sec. 23e that there exists a nodeless eigenfunction $y_0(x)$ and a corresponding
lowest eigenvalue $\lambda_0$. The eigenvalue-eigenfunction problem is
reducible to the variational forms B and C.

It will now be proved that $\lambda_0$ is not only a stationary value of Q/N as
required by Sec. 24, but that it is an absolute minimum of Q/N.

The transformed eigenfunction obtained from $y_0(x)$ by Eq. (23-19) will be denoted
by $w_0(x)$. It is a solution of Eq. (23-20) which vanishes at the end points but has no
interior nodes. If we express the integral Q of Eq. (24-8) in terms of w(x), it takes the
form

$$Q = \int_a^b\left[\left(w' - \frac{p'}{2p}w\right)^2 + \frac{q}{p}w^2\right]dx, \tag{25-1}$$

while the normalizing integral becomes

$$N = \int_a^b \frac{\rho}{p}w^2\,dx. \tag{25-2}$$

Here w(x) is assumed to be real.

Our discussion is based on a theorem in the calculus of variations which states that
if y = U(x) is a nodeless extremal of the integral

$$D[y] = \int_{x_0}^{x_1} F(x,y,y')\,dx$$

subject to the boundary conditions

$$y(x_0) = \alpha, \qquad y(x_1) = \beta,$$

and if $F_{y'y'}(x,y,p) > 0$ for every point x, y in the neighborhood of x, U(x) and for every
finite value p, then U actually minimizes the integral D.¹ It is usually understood
that the end points $x_0$, $x_1$ are regular, although, as we shall see, the theorem must hold
for singular boundary points if it holds for regular ones.

We identify D with the integral

$$J[w,\lambda_0] \equiv \int_{a'}^{b'}(py'^2 + qy^2 - \lambda_0\rho y^2)\,dx = \int_{a'}^{b'}\left[\left(w' - \frac{p'}{2p}w\right)^2 + \frac{q}{p}w^2 - \lambda_0\frac{\rho}{p}w^2\right]dx, \tag{25-3}$$

restricting p(x) to positive values as usual, and choosing a' and b' according to
a < a' < b' < b. We identify the nodeless extremal U(x) with $w_0(x) = p^{1/2}y_0(x)$,
choosing the end-point conditions to suit, i.e., writing them in the form

$$w(a') = \alpha \equiv w_0(a'), \qquad w(b') = \beta \equiv w_0(b'). \tag{25-4}$$

In the case of integrals of the quadratic type under consideration it can be shown that
any minimum must be an absolute minimum² so that $J[w,\lambda_0] \ge J[w_0,\lambda_0]$ for all
comparison functions w which conform to the boundary conditions (25-4). We designate
the original problem of minimizing Q/N subject to the singular-point conditions and
the problem of minimizing J subject to (25-4) as problems X and Y, respectively.

1 Cf. O. Bolza, Lectures on the Calculus of Variations, 22, p. 96, 1924. In stating
the theorem we use the symbol $F_{y'y'}(x,y,p)$ for

$$\left.\frac{\partial^2 F(x,y,y')}{\partial y'^2}\right|_{y'=p}.$$

2 The second variation of J is defined as

$$\delta^2 J = \left[\frac{d^2}{d\epsilon^2}J[w_0 + \epsilon\eta]\right]_{\epsilon=0}.$$

The first variation must vanish and $\delta^2 J \ge 0$ for every admissible variation $\eta(x)$ if
$w_0$ minimizes J. In the case of a quadratic integral of this type

$$J[w_0 + \eta] - J[w_0] = \delta J + \tfrac{1}{2}\delta^2 J \ge 0.$$

As $w_0 + \eta$ is an arbitrary admissible comparison function, we infer that the
minimum is absolute.

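Let $\mathcal{Q}$ and $\mathcal{N}$ be defined by

$$\mathcal{Q}[w] = \int_{a'}^{b'}\left[\left(w' - \frac{p'}{2p}w\right)^2 + \frac{q}{p}w^2\right]dx, \qquad \mathcal{N}[w] = \int_{a'}^{b'}\frac{\rho}{p}w^2\,dx. \tag{25-5}$$

Then (cf. Sec. 24 and Appendix D),

$$J[w_0,\lambda_0] = \mathcal{Q}[w_0] - \lambda_0\mathcal{N}[w_0] = 0, \qquad \lambda_0 = \frac{\mathcal{Q}[w_0]}{\mathcal{N}[w_0]}. \tag{25-6}$$

If $\eta(x)$ is an arbitrary continuous function of x, with piece-by-piece continuous first
derivatives, which vanishes at the end points a', b',

$$J[w_0 + \eta,\lambda_0] - J[w_0,\lambda_0] = \mathcal{Q}[w_0 + \eta] - \lambda_0\mathcal{N}[w_0 + \eta] \ge 0.$$

Finally

$$\frac{\mathcal{Q}[w_0 + \eta]}{\mathcal{N}[w_0 + \eta]} \ge \frac{\mathcal{Q}[w_0]}{\mathcal{N}[w_0]} = \lambda_0, \tag{25-7}$$

so that $\lambda_0$ is an absolute minimum for $\mathcal{Q}/\mathcal{N}$.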

If we were interested only in problems involving fixed regular end points we might
have identified a' and b' with a and b from the beginning. Our theorem would then
be proved. As it is we must resort to a limiting process to show that $\lambda_0$ is a minimum
for Q/N as well as for $\mathcal{Q}/\mathcal{N}$.

Let y(x) denote any admissible comparison function for the original Sturm-Liouville
problem X in the complete interval a < x < b and let $w = p^{1/2}y$. w does
not automatically satisfy the boundary conditions of problem Y at a' and b', but it is
easy to devise a function $\bar w(a',b',x)$ which will satisfy them and which will be identical
with w(x) except in the immediate neighborhood of the points a' and b'. Furthermore,
since w(x) and $w_0(x)$ vanish at a and b, we can choose $\bar w(a',b',x)$ so that it becomes equal
to w(x) in the limit when a' → a and b' → b,¹ and so that

$$\lim_{a',b'\to a,b}\frac{\mathcal{Q}[\bar w]}{\mathcal{N}[\bar w]} = \frac{Q[w]}{N[w]}.$$

1 Suppose, for example, that the graph of w(x) lies above that for $w_0(x)$ in the
neighborhood of the points x = a and x = b. Then for points a' and b' not too far
from a and b, respectively, the quantities f(a') and g(b') defined by

$$f(a') \equiv w(a') - w_0(a'), \qquad g(b') \equiv w(b') - w_0(b')$$

are positive. Under these circumstances we can choose $\bar w(a',b',x)$ as follows:






In view of Eq. (25-7) $\mathcal{Q}[\bar w]/\mathcal{N}[\bar w]$ cannot approach a limit less than $\lambda_0$ and we conclude
that

$$\frac{Q[w]}{N[w]} \ge \lambda_0,$$

which was to be proved.


It is possible to show that each of the higher eigenvalues is also a
minimum value of Q/N for a suitably restricted class of admissible
comparison functions. To be precise, let $y_n(x)$ denote an eigenfunction
having n internal nodes. Then its eigenvalue $\lambda_n$ is the minimum value
of Q/N for all comparison functions which are orthogonal to $\rho y_0$, $\rho y_1$,
$\rho y_2$, ..., $\rho y_{n-1}$ and, in addition, conform to the usual continuity and
boundary conditions.¹
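The minimum properties of the eigenvalues can be illustrated with trial functions. The sketch below assumes the regular problem p = ρ = 1, q = 0 on 0 < x < π with vanishing boundary values, for which $\lambda_0 = 1$ and $\lambda_1 = 4$; the trial functions and all function names are illustrative choices, not taken from the text. The quotient is formed from Eq. (24-8) for Q and from N = ∫ρy² dx.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def rayleigh(y, dy, a=0.0, b=math.pi,
             p=lambda x: 1.0, q=lambda x: 0.0, rho=lambda x: 1.0):
    """Q/N with Q = int(p*y'^2 + q*y^2) dx and N = int(rho*y^2) dx (real y)."""
    Q = simpson(lambda x: p(x) * dy(x) ** 2 + q(x) * y(x) ** 2, a, b)
    N = simpson(lambda x: rho(x) * y(x) ** 2, a, b)
    return Q / N

# Nodeless trial function vanishing at both end points.
def trial0(x):
    return x * (math.pi - x)

def d_trial0(x):
    return math.pi - 2 * x

# One-node trial function, odd about x = pi/2 and hence orthogonal to y0 = sin x.
def trial1(x):
    return x * (math.pi - x) * (math.pi / 2 - x)

def d_trial1(x):
    return math.pi ** 2 / 2 - 3 * math.pi * x + 3 * x ** 2

if __name__ == "__main__":
    print(rayleigh(trial0, d_trial0))   # lies above lambda_0 = 1
    print(rayleigh(trial1, d_trial1))   # lies above lambda_1 = 4
```

For these polynomials the quotients can also be found by hand: they equal 10/π² (about 1.013) and 42/π² (about 4.255), each slightly above the eigenvalue it bounds.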

*25b. The Expansion of Arbitrary Functions in Terms of Eigenfunctions.
As we have already observed, the problem of the expansion of an
arbitrary function in terms of an infinite series of mutually orthogonal
functions is of fundamental importance in wave mechanics as in the
older branches of mathematical physics. The first step toward the
justification of such an expansion is to show that one can approximate
the function as closely as desired in the sense of the theory of least
squares by means of a suitable linear combination of members of the
given orthogonal system.

To be specific, let $u_0(x)$, $u_1(x)$, $u_2(x)$, ... denote an infinite series
of functions spread out over the finite or infinite fundamental interval
a < x < b in which they are mutually orthogonal, quadratically integrable,
and normalized. Let f(x) denote an arbitrary function which is


$$\bar w(a',b',x) = \begin{cases} w(x) - [a' + f(a') - x], & a' < x < a' + f \\ w(x), & a' + f < x < b' - g \\ w(x) - [x - b' + g(b')], & b' - g < x < b' \end{cases}$$

The boundary conditions (25-4) are satisfied, and although $\bar w(a',b',x)$ has a discontinuous
slope at the points x = a' + f(a') and x = b' − g(b'), it is an admissible
comparison function for problem Y. The reader will readily verify that in this case,
since f(a) = g(b) = 0, $\bar w(a,b,x) = w(x)$ and

$$\lim_{a',b'\to a,b}\left(\frac{\mathcal{Q}[\bar w]}{\mathcal{N}[\bar w]}\right) = \frac{Q[w]}{N[w]}.$$


1 In the case of a problem involving a finite interval and regular end points it is
proved in Courant-Hilbert, M.M.P., Kap. VI, §1, that if Q/N has a minimum for
the specified type of comparison function, it must be $\lambda_n$. The existence of the minimum
has been implicitly proved by Morse (cf. Marston Morse, "Sufficient Conditions
in the Problem of Lagrange," Am. J. Math. 53, 517 (1931); Kuen Sen Hu, Theorem
10.3, Thesis, University of Chicago, 1932, "Contributions to the Calculus of Variations,"
Chicago, 1933). The extension of the theorem to the case of problems with
singular end points or infinite fundamental intervals can be carried through as in the
case of the lowest eigenvalue.




piece-by-piece continuous and quadratically integrable in the same
interval. To approximate the function f by means of the first n members
of the series we form the expression

$$f_n(x) \equiv \sum_{\nu=0}^{n} c_\nu u_\nu, \tag{25-8}$$

where the coefficients c are undetermined constants. The error of this
approximation is

$$\epsilon_n \equiv f - f_n = f - \sum_{\nu=0}^{n} c_\nu u_\nu, \tag{25-9}$$

and for the mean square of the error we may use the expression

$$W_n^2 = (\epsilon_n,\epsilon_n) = (f,f) - \sum_{\nu=0}^{n}[c_\nu(u_\nu,f) + c_\nu^*(f,u_\nu) - c_\nu c_\nu^*].$$

To minimize this function of the coefficients $c_0$, $c_1$, ..., $c_n$ we differentiate
with respect to the real and imaginary parts of each of the c's
and set the derivatives equal to zero. This yields the usual Fourier
coefficient formula [cf. Eq. (22-5)]

$$c_\nu = \int_a^b f u_\nu^*\,dx \equiv (f,u_\nu), \qquad \nu = 0, 1, 2, \ldots, n \tag{25-10}$$

The expression for $W_n^2$ then reduces to

$$W_n^2 = (f,f) - \sum_{\nu=0}^{n}|c_\nu|^2,$$

from which we deduce Bessel's inequality

$$\sum_{\nu=0}^{n}|c_\nu|^2 \le (f,f).$$

If the approximation $f_n$ can be made as accurate as desired by increasing
the number of terms, it is necessary that

$$\lim_{n\to\infty} W_n^2 = (f,f) - \sum_{\nu=0}^{\infty}|c_\nu|^2 = 0. \tag{25-11}$$

When this condition is satisfied for an arbitrary f we say that the system
of functions $u_0$, $u_1$, $u_2$, ... is complete [cf. Eq. (22-30)]. If the condition
is satisfied, any function orthogonal to every member of the system must
vanish identically, for in the case of such a function the c's are all zero
and (f,f) = 0. Clearly, if we remove any member of such a complete
system the remaining system will be incomplete.
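The least-squares statements above can be tried out numerically. The sketch below assumes, purely for illustration, the orthonormal system $u_\nu(x) = \sqrt{2/\pi}\,\sin((\nu+1)x)$ on 0 < x < π and the function f(x) = x(π − x); these choices and all function names are assumptions, not taken from the text. It evaluates the coefficients of Eq. (25-10) and the mean square error $W_n^2 = (f,f) - \sum|c_\nu|^2$.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def u(nu):
    """Illustrative orthonormal function u_nu on (0, pi)."""
    k = nu + 1
    return lambda x: math.sqrt(2 / math.pi) * math.sin(k * x)

def f(x):
    """Illustrative function to be approximated."""
    return x * (math.pi - x)

def mean_square_errors(n_max):
    """Return (f,f) and the list of W_n^2 for n = 0, ..., n_max."""
    ff = simpson(lambda x: f(x) ** 2, 0.0, math.pi)
    errors, partial = [], 0.0
    for nu in range(n_max + 1):
        un = u(nu)
        c = simpson(lambda x: f(x) * un(x), 0.0, math.pi)  # Fourier coefficient (f, u_nu)
        partial += c * c
        errors.append(ff - partial)                        # W_n^2 after adding term nu
    return ff, errors

if __name__ == "__main__":
    ff, errs = mean_square_errors(9)
    print(ff)
    for n, w2 in enumerate(errs):
        print(n, w2)
```

The printed errors never increase and stay non-negative, which is Bessel's inequality, and they tend toward zero as terms are added, as the completeness condition (25-11) requires for this system.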




In the case of a system of functions which is not directly orthogonal but
orthogonal with respect to an essentially positive weighting function
$\rho(x)$, it is convenient to introduce a modified definition of completeness.
Let us suppose, for example, that the series of functions $u_0$, $u_1$, $u_2$, ...
forms a normal orthogonal system in the sense that

$$(\rho u_i, u_j) = \delta_{ij}. \tag{25-12}$$

Let the accuracy of the approximation $f_n$ be tested by the weighted
mean square

$$W_n^2 = (\rho\epsilon_n,\epsilon_n) = (\rho f,f) - \sum_{\nu=0}^{n}[c_\nu(u_\nu,\rho f) + c_\nu^*(\rho f,u_\nu) - c_\nu c_\nu^*].$$

$W_n^2$ is a minimum when the coefficients are determined by the modified
Fourier-coefficient formula [cf. Eq. (23-13)]

$$c_\nu = (\rho f,u_\nu). \tag{25-13}$$

If the approximation can be made as accurate as desired,

$$\lim_{n\to\infty} W_n^2 = (\rho f,f) - \sum_{\nu=0}^{\infty}|c_\nu|^2 = 0, \tag{25-14}$$

and we again say that the series of functions $u_0$, $u_1$, $u_2$, ... is complete.

Consider next the set of eigenfunctions $y_0$, $y_1$, $y_2$, ... obtained by
solving a Sturm-Liouville problem subject to boundary conditions which
permit us to throw the problem into variational form. Let us further
assume that $\lambda_{n+1}$ is the minimum value of Q/N for functions which are
orthogonal to $\rho y_0$, $\rho y_1$, $\rho y_2$, ..., $\rho y_n$ and that (cf. Sec. 23)

$$\lim_{n\to\infty}\lambda_n = \infty. \tag{25-15}$$

Under these circumstances the conditionally orthogonal system of eigenfunctions
is complete. The following proof is a modification of the one
given in Courant-Hilbert, Kap. VI, §3.

Although the definition of completeness can be extended to cover a system of functions
which are not normalized, we assume that the eigenfunctions under discussion
have been normalized in accordance with Eq. (25-12). We also make the (provisional)
assumption that the arbitrary function f(x) satisfies the same boundary and continuity
restrictions as the admissible comparison functions for the variation problem. The
error function $\epsilon_n$ is defined as before by the equations

$$\epsilon_n \equiv f - \sum_{\nu=0}^{n} c_\nu y_\nu, \qquad c_\nu = (\rho f,y_\nu). \tag{25-16}$$

It follows from the above definition that

$$(\rho y_i,\epsilon_n) = (\epsilon_n,\rho y_i)^* = 0. \qquad i = 0, 1, \ldots, n \tag{25-17}$$


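Hence

$$(Ay_i,\epsilon_n) = 0. \qquad i = 0, 1, \ldots, n \tag{25-18}$$

As A is self-adjoint, while f and $\epsilon_n$ satisfy the boundary conditions,

$$(Ay_i,\epsilon_n) - (y_i,A\epsilon_n) = p(b)[\epsilon_n^*(b)y_i'(b) - \epsilon_n^{*\prime}(b)y_i(b)] - p(a)[\epsilon_n^*(a)y_i'(a) - \epsilon_n^{*\prime}(a)y_i(a)] = 0.$$

Thus

$$(A\epsilon_n,y_i) = (y_i,A\epsilon_n)^* = 0. \tag{25-19}$$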

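But since $\lambda_{n+1}$ is the minimum value of Q/N for functions orthogonal to $\rho y_0$, $\rho y_1$,
$\rho y_2$, ..., $\rho y_n$,

$$\lambda_{n+1} \le \frac{Q[\epsilon_n]}{N[\epsilon_n]} = -\frac{(A\epsilon_n,\epsilon_n)}{(\rho\epsilon_n,\epsilon_n)}. \tag{25-20}$$

It follows that as n becomes infinite either $(A\epsilon_n,\epsilon_n)$ must approach $-\infty$, or else $(\rho\epsilon_n,\epsilon_n)$
must approach zero. To ascertain the behavior of the numerator for large values
of n we form

$$(Af,f) = (A\epsilon_n,\epsilon_n) - \sum_{\nu=0}^{n}|c_\nu|^2\lambda_\nu + \sum_{\nu=0}^{n}[c_\nu^*(A\epsilon_n,y_\nu) + c_\nu(Ay_\nu,\epsilon_n)].$$

The second summation vanishes due to Eqs. (25-18) and (25-19). Hence $(A\epsilon_n,\epsilon_n)$ is
the sum of a term which is independent of n and of another which increases with n
when n is large. We conclude that

$$\lim_{n\to\infty}(A\epsilon_n,\epsilon_n) \ne -\infty,$$

and that consequently

$$\lim_{n\to\infty}(\rho\epsilon_n,\epsilon_n) = \lim_{n\to\infty}W_n^2 = 0. \tag{25-21}$$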

n-* 00 n— ♦ 00 

This proves the completeness of the system of eigenfunctions insofar as functions f
satisfying the boundary conditions are concerned. For the removal of this restriction
and the extension of Eq. (25-14) to all piece-by-piece continuous quadratically integrable
functions see Courant-Hilbert, Kap. VI, §3.

It follows from the definition of completeness that the series

$$\sum_{\nu=0}^{\infty}(\rho f,y_\nu)\,y_\nu$$

converges on the average in the interval a, b. Since the mean square of
the error function for an n term approximation approaches zero in the
limit, we say that the completeness of the sequence of functions
implies that the series $\sum_{\nu=0}^{\infty} c_\nu y_\nu(x)$ converges "in the mean square" to the
limit function f(x). An investigation by Weyl cited in Sec. 30 indicates
that if f(x) satisfies all the restrictions placed on the comparison functions
of the variation problem, the series will not only converge in mean square,




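but will converge uniformly at every point to the value f(x).¹ Hence
we shall assume for such functions that

$$f(x) = \sum_{\nu=0}^{\infty}(\rho f,y_\nu)\,y_\nu(x). \tag{25-22}$$

If the functions $y_\nu(x)$ are not normalized, the expansion becomes

$$f(x) = \sum_{\nu=0}^{\infty}\frac{(\rho f,y_\nu)}{(\rho y_\nu,y_\nu)}\,y_\nu(x). \tag{25-23}$$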


Our proof of the completeness theorem for the eigenfunctions of a
Sturm-Liouville problem is limited to the case in which the discrete
eigenvalue spectrum extends to $+\infty$. In Sec. 30 a modified statement
of the theorem is formulated which applies when there is a continuous
spectrum of eigenvalues extending to infinity and replacing or supplementing
the discrete spectrum. In such cases, also, there is a convenient
expansion theorem applicable to physically admissible functions and
containing an integral over the continuous spectrum of B type eigenfunctions
in addition to the sum over the discrete functions.

In mentioning these expansions, however, it should be remarked that
in the last analysis they are less general than the completeness relation
itself, since (25-11) holds for an arbitrary quadratically integrable f(x), and
so less necessary. We shall return to this point in Secs. 30 and 36.

As a corollary of the completeness theorem we have the following
formula for the scalar product of two functions f(x), g(x) which are
piece-by-piece continuous and quadratically integrable.

$$(\rho f,g) = \sum_{\nu=0}^{\infty} c_\nu d_\nu^*, \qquad d_\nu = (\rho g,y_\nu) \tag{25-24}$$

[cf. Eq. (22-32) ff.].
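A numerical illustration of the scalar-product formula may be helpful. The sketch below assumes ρ = 1, the orthonormal system $y_\nu(x) = \sqrt{2/\pi}\,\sin((\nu+1)x)$ on 0 < x < π, and the real functions f(x) = x(π − x) and g(x) = x; these choices and all function names are illustrative assumptions. For real functions the complex conjugate in the series plays no part.

```python
import math

def simpson(h_func, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = h_func(a) + h_func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h_func(a + i * step)
    return total * step / 3

def y(nu):
    """Illustrative orthonormal eigenfunction on (0, pi), weight rho = 1."""
    k = nu + 1
    return lambda x: math.sqrt(2 / math.pi) * math.sin(k * x)

def f(x):
    return x * (math.pi - x)

def g(x):
    # g does not vanish at x = pi; it is only required to be quadratically integrable.
    return x

def scalar_product_direct():
    """(f, g) evaluated as an integral."""
    return simpson(lambda x: f(x) * g(x), 0.0, math.pi)

def scalar_product_series(n_terms=40):
    """Sum of c_nu * d_nu over the first n_terms eigenfunctions."""
    total = 0.0
    for nu in range(n_terms):
        yn = y(nu)
        c = simpson(lambda x: f(x) * yn(x), 0.0, math.pi)
        d = simpson(lambda x: g(x) * yn(x), 0.0, math.pi)
        total += c * d
    return total

if __name__ == "__main__":
    print(scalar_product_direct(), scalar_product_series())
```

The direct integral equals π⁴/12 for this pair of functions, and the partial sums of the series approach the same value as more terms are included.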

1 Courant-Hilbert give a proof of (25-22) that covers all cases in which the functions
$y_\nu(x)$ are bounded and the interval a ≤ x ≤ b is finite (cf. Courant-Hilbert,
pp. 370-371).



CHAPTER V 


THE DISCRETE ENERGY SPECTRUM OF THE TWO-PARTICLE 
CENTRAL-FIELD PROBLEM 

26. THE BEHAVIOR OF SOLUTIONS OF AN ORDINARY SECOND-ORDER 
DIFFERENTIAL EQUATION NEAR A SINGULAR POINT 

If the end points of the fundamental interval $a < x < b$ of a Sturm-Liouville
problem are singular points of the differential equation,¹ it is
of the greatest importance, both for proving that the problem has solutions
and for finding the solutions, to know some of the results of the
general theory of the behavior of solutions of second-order linear homogeneous
differential equations in the neighborhood of such points.²
In stating the results of the theory we restrict ourselves a priori to the
case in which the coefficients in Eqs. (23-4) and (24-1) are single-valued
and analytic except at certain isolated, immovable singular points.
(As the average reader knows, a function $y(x)$ is said to be analytic
at a point $x = a$ provided that it can be developed in a Taylor's series,
$\sum_{n=0}^{\infty} a_n (x - a)^n$, which converges in the neighborhood of that point.
The function $y(x)$ is said to have an isolated singular point at $x = x'$
if it is analytic at all points in the neighborhood of $x'$ but is not analytic
at $x'$ itself.)

¹ Cf. footnote 1, p. 79.

² Cf. E. L. Ince, Ordinary Differential Equations, pp. 160-168, 356-370, London
and New York, 1927; L. Bieberbach, Differentialgleichungen, 3d ed., pp. 206-219,
1930; Riemann-Weber, D.P., pp. 248-264.

Let us now divide Eq. (23-4) through by $p_0(x)$, rewriting it in the
standard form

$$ \frac{d^2 y}{dx^2} + p_1(x)\frac{dy}{dx} + p_2(x)\,y = 0. \tag{26-1} $$

It can be proved that when the equation is written in this form the
solutions are all analytic wherever the coefficients are analytic [cf., e.g.,
Bieberbach, loc. cit., p. 207]. But since the zeros of $p_0(x)$ generate
poles in the new coefficients $p_1$ and $p_2$, it will be seen that the singular
points of the differential equation, as defined in Sec. 17, are now simply
singular points for one or the other of the coefficients. In the neighborhood
of an ordinary (i.e., nonsingular) point of the equation, say $x = a$,
every solution is expansible in a convergent series

$$ y = \sum_{\nu=0}^{\infty} a_\nu (x - a)^\nu. \tag{26-2} $$

Solutions can also be analytic at the singular points of the equation in 
some cases, but, as previously noted, have a tendency to become dis- 
continuous at such points. 

When the development (26-2) does fail, the next possibility in order of
simplicity is an expansion of the form

$$ y = (x - a)^r \sum_{\nu=0}^{\infty} a_\nu (x - a)^\nu, \qquad a_0 \neq 0. \tag{26-3} $$


It can be proved that at least one solution exists in the neighborhood of
$x = a$ for which the expansion (26-3) is possible, provided that the point
in question is a pole for one, or both, of the coefficients $p_1(x)$ and $p_2(x)$, of
order not greater than one for $p_1$ and two for $p_2$. In other words, a
solution of the form (26-3) exists if Eq. (26-1) can be written in the form

$$ \frac{d^2 y}{dx^2} + \frac{P_1(x - a)}{x - a}\frac{dy}{dx} + \frac{P_2(x - a)}{(x - a)^2}\,y = 0, \tag{26-4} $$

where $P_1$ and $P_2$ are analytic at $x = a$ and hence expansible in power
series about that point. When this condition is fulfilled the point $x = a$
is said to be a regular singular point ("Stelle der Bestimmtheit"). Otherwise
it is an irregular singular point.

To obtain the solution in the neighborhood of a regular singular point,
one may expand $P_1$ and $P_2$ in the series

$$ P_1 = \sum_{\nu=0}^{\infty} \beta_\nu (x - a)^\nu, \qquad P_2 = \sum_{\nu=0}^{\infty} \gamma_\nu (x - a)^\nu, $$

insert the right-hand member of Eq. (26-3) for $y$ into Eq. (26-4), and
collect terms involving like powers of $(x - a)$. The sum of the coefficients
of each such set of terms should vanish. The series of equations
obtained from this condition determine both the exponent $r$ and the
relative values of the successive coefficients $a_\nu$. From the terms of
lowest power one obtains the equation for $r$, viz.,

$$ r(r - 1) + r\beta_0 + \gamma_0 = 0. \tag{26-5} $$

This is known as the fundamental equation or indicial equation for the
singular point. If the roots of this equation are distinct, and do not
differ by an integer, there are two linearly independent solutions of the
equation having the form (26-3), one for each root.¹
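
The classification of the roots of the indicial equation (26-5) can be illustrated with a short Python sketch. The function names `indicial_roots` and `classify` and the tolerance `tol` are assumptions introduced here for illustration; they are not part of the text.

```python
import cmath


def indicial_roots(beta0, gamma0):
    """Roots of r(r - 1) + beta0*r + gamma0 = 0, ordered so that the
    first root has the larger real part."""
    b = beta0 - 1                      # equation reads r^2 + b r + gamma0 = 0
    disc = cmath.sqrt(b * b - 4 * gamma0)
    r1, r2 = (-b + disc) / 2, (-b - disc) / 2
    if r1.real < r2.real:
        r1, r2 = r2, r1
    return r1, r2


def classify(beta0, gamma0, tol=1e-12):
    """Return (r1, r2, kind), where kind says which form of solution applies."""
    r1, r2 = indicial_roots(beta0, gamma0)
    diff = r1 - r2
    differ_by_integer = (abs(diff.imag) < tol
                         and abs(diff.real - round(diff.real)) < tol)
    kind = "equal or integer-spaced roots" if differ_by_integer else "two power-series solutions"
    return r1, r2, kind
```

With $\beta_0 = 1$, $\gamma_0 = 0$ both roots are zero, so the sketch reports the equal-root case; with $\beta_0 = 1$, $\gamma_0 = -1/9$ the roots are $\pm 1/3$ and two solutions of the form (26-3) exist.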

¹ Any solution of a second-order linear homogeneous differential equation can be
expressed as a linear combination of two linearly independent solutions. A pair of
linearly independent solutions is called a fundamental system. From any one such
system an infinity of others can always be formed by making linear combinations.
The problem of solving a differential equation in the neighborhood of a point $x = a$
may be regarded as finished when two such linearly independent particular solutions
are found.

On the other hand, if the roots of the fundamental equation are equal,
or differ by an integer, the root $r_1$ having the larger real part¹ is to be used
to give a solution, say $y_1$, of the form (26-3), and the second linearly
independent solution can be proved to have the form

$$ y_2 = y_1 A \log (x - a) + (x - a)^{r_2} \varphi(x - a). \tag{26-6} $$

Here $\varphi(x - a)$ is analytic at $x = a$ and does not vanish there. The
exact form of $\varphi$ may be determined like that of $y_1$ by expansion in series
and the method of undetermined coefficients. $A$ is a constant which is
not zero if $r_1 = r_2$, but can vanish when $r_1 - r_2$ is any other integer.

¹ This rule covers the general case where $\beta_0$ and $\gamma_0$ may be complex. If they are
real, the roots of Eq. (26-5) are conjugate complex quantities, or else real. In the
former case, or in the case of equal roots, either one may be used.

When the point $a$ is an irregular singular point the solution in its
neighborhood is more difficult to handle, though convergent series involving
an infinite number of negative as well as of positive powers of $x$
are known to exist. Asymptotic but ultimately divergent series involving
only a finite number of negative powers of $x$ also yield useful representations
of the solutions of differential equations in the neighborhood of
such points.²

² Cf. L. Schlesinger, Einführung in die Theorie der gewöhnlichen Differentialgleichungen,
3rd ed., Kap. 8, Berlin, 1922; L. Bieberbach, Differentialgleichungen, 3d ed.,
Abschnitt 2, Kap. 4, §10, Berlin, 1930.

The Irregular Singular Point at Infinity. If the transformation

$$ x = \frac{1}{z}, \qquad y(x) = u(z) \tag{26-7} $$

yields an equation for $u(z)$ with a singular point at the origin, we say that the equation
for $y(x)$ has a singular point at infinity. If $u(z)$ has a regular singular point at the
origin, $y(x)$ is said to have a regular singular point at infinity, and the behavior of $y(x)$
for large values of $x$ is determined by the behavior of $u(z)$ for small values of $z$.

Consider, for example, the linear oscillator problem of Sec. 18. After applying
the transformation (26-7) to (18-2), we obtain

$$ u'' + \frac{2}{z}\,u' + \frac{\kappa}{z^4}\left[E - V\!\left(\frac{1}{z}\right)\right]u = 0 $$

as the differential equation for $u(z) = y(1/z)$. On account of the term in $u'(z)$ the
origin is necessarily a singular point. The condition for a regular singular point is
that $[E - V(1/z)]/z^2$ shall be analytic at $z = 0$, and cannot possibly be satisfied for
all values of $E$.

Fortunately we have already derived in Sec. 19 and Appendix C the essential
information regarding the behavior of solutions of the linear oscillator equation at
infinity. Thus we know that if $V(x)$ approaches a positive limit greater than $E$ when
$x$ approaches infinity, or if $V(x)$ becomes positively infinite as any finite power of $x$,
there is one and only one integral curve for a given value of $E$ which satisfies the
singular-point boundary condition at $x = \infty$. On the other hand, if $[V(x) - E]$
approaches a finite negative limit as $x$ becomes infinite, the curves all continue to
oscillate as $x$ increases, and none of them satisfies the singular-point condition,
although all remain finite and conform to the continuity condition of Sec. 17. We
shall return to the discussion of the properties of this latter class of integral curves
in Sec. 30.

27. THE LEGENDRE POLYNOMIALS 

27a. General Properties of the Legendre Equation. As an example
of the application of the mathematical theory developed in Secs. 22 to 26
let us consider the special Sturm-Liouville differential equation

$$ \frac{d}{dx}\left[(1 - x^2)\frac{dy}{dx}\right] + \lambda y = 0 \tag{27-1} $$

in the interval $-1 \le x \le 1$. The equation has singular points at
$x = \pm 1, \pm\infty$. In order to examine the point $x = +1$ we throw the
equation into the standard form (26-4) and find that

$$ P_1 = \frac{2x}{1 + x}, \qquad P_2 = -\lambda\,\frac{x - 1}{x + 1}. $$

Both of these functions are analytic at $x = 1$. Hence this is a regular
singular point.

The constant terms in the expansions of $P_1$ and $P_2$ about $x = 1$ are
$\beta_0 = 1$, $\gamma_0 = 0$,
respectively. The fundamental equation reduces to

$$ r(r - 1) + r = r^2 = 0. \tag{27-2} $$

As the roots are both zero, the fundamental system of solutions $y_1$, $y_2$
in the neighborhood of $x = 1$ reduces to the form

$$ y_1 = \sum_{\nu=0}^{\infty} a_\nu (1 - x)^\nu, \tag{27-3} $$

$$ y_2 = y_1 A \log (1 - x) + \sum_{\nu} b_\nu (1 - x)^\nu. \tag{27-4} $$

Evidently $y_1$ is analytic at $x = 1$. Furthermore, it cannot vanish there
since substitution of the power series (27-3) into the differential equation
yields the recurrence formula

$$ a_{\nu+1} = a_\nu \left[\frac{-\lambda + \nu(\nu + 1)}{2(\nu + 1)^2}\right], \tag{27-5} $$

from which it follows that if $y_1$ vanishes at $x = 1$ (i.e., $a_0 = 0$), it must
vanish identically.

$y_1$ conforms to the singular-point boundary condition (23-17) at
$x = 1$ for all values of $\lambda$ with the constant $m$ set equal to zero. It has
a finite number of nodes in the neighborhood of the boundary point in
question.

The symmetry of the differential equation with respect to the transformation
$x \to -x$ shows that the point $x = -1$ is also a regular singular
point with the same fundamental equation (27-2). We conclude that
for all values of $\lambda$ there are integral curves which satisfy the singular-point
condition at $x = -1$ and have a finite number of nodes in the
neighborhood of that point.

Thus there is an interval $G$ extending from $\lambda = -\infty$ to $\lambda = +\infty$ and
conforming to the requirements of the initial existence theorem of
Sec. 23e. The theorems of Sec. 23e show further that there is a minimum
eigenvalue with a nodeless eigenfunction and that the infinite sequence of
ascending eigenvalues satisfies the condition

$$ \lim_{n \to \infty} \lambda_n = \infty. $$


It follows from Sec. 23b that the eigenfunctions are mutually orthogonal,
and from Sec. 24 that the eigenvalues are the stationary values of
$Q/N$, where

$$ Q = \int_{-1}^{+1} (1 - x^2)\,|y'|^2\,dx, \qquad N = \int_{-1}^{+1} |y|^2\,dx, $$
and the boundary conditions are of the singular-point type. Since there
is a nodeless eigenfunction, $Q/N$ must have a minimum value. Direct
examination shows that $Q$ and $N$ are essentially positive, so that

$$ \lambda_n \ge 0. $$

Finally the eigenfunctions form a complete orthogonal system.

27b. Explicit Determination of Eigenvalues and Eigenfunctions.
The eigenvalues of $\lambda$ can be derived from the series (27-3). Let $u_\nu$ denote
the $\nu$th term of this series. Since

$$ \lim_{\nu \to \infty} \frac{u_{\nu+1}}{u_\nu} = \frac{1 - x}{2}, $$

the boundary point $x = -1$ is the limit of convergence unless the series
terminates with the $n$th term due to the circumstance that

$$ \lambda = n(n + 1), \qquad n = 0, 1, 2, 3, 4, \ldots \tag{27-6} $$

Let us assume that the series does not so terminate. Then for sufficiently
large values of $\nu$, say $\nu > h$, we can neglect $\lambda$ in comparison with $\nu(\nu + 1)$.
It follows from (27-5) that to this degree of approximation

$$ 2^h h\,a_h = 2^{h+1}(h + 1)\,a_{h+1} = \cdots = 2^\nu \nu\,a_\nu = \cdots, \qquad \text{i.e.,} \quad a_\nu = \frac{\text{constant}}{2^\nu \nu}. $$

Thus the higher members of the series approximate to the corresponding
members of the expansion of $\log \frac{2}{1 + x}$ in powers of $(1 - x)/2$,
and the sum of the series must approach infinity as $x$ approaches the
point $-1$, in such a way that the singular-point boundary condition is
not satisfied.¹ We conclude that the values of $\lambda$ given by Eq. (27-6)
are the only ones which permit solutions of our problem and are hence the
desired eigenvalues.
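
The termination argument can be checked with exact rational arithmetic. The following Python sketch applies the recurrence (27-5) with $a_0 = 1$; the helper names `series_coefficients` and `scaled`, and the sample values $\lambda = 6$ and $\lambda = 5$, are choices made here for illustration only.

```python
from fractions import Fraction


def series_coefficients(lam, count):
    """First `count` coefficients of the series (27-3), starting from a_0 = 1 and
    using a_{v+1} = a_v * (v(v+1) - lam) / (2 (v+1)^2) from (27-5)."""
    a = [Fraction(1)]
    for v in range(count - 1):
        a.append(a[-1] * (v * (v + 1) - lam) / (2 * (v + 1) ** 2))
    return a


def scaled(a, v):
    """The combination 2^v * v * a_v, which should level off when the series does not end."""
    return 2 ** v * v * a[v]


# lam = n(n+1) with n = 2: every coefficient beyond a_2 vanishes.
terminating = series_coefficients(Fraction(6), 8)

# lam = 5 is not of the form n(n+1): the series never ends.
generic = series_coefficients(Fraction(5), 401)
ratio = float(scaled(generic, 400) / scaled(generic, 200))
```

For $\lambda = 6$ the surviving coefficients give $1 - 3(1 - x) + \tfrac{3}{2}(1 - x)^2$, which equals $\tfrac12(3x^2 - 1)$. For $\lambda = 5$ the quantity $2^\nu \nu\, a_\nu$ changes by only about one per cent between $\nu = 200$ and $\nu = 400$, in line with the approximation $a_\nu = \text{constant}/(2^\nu \nu)$.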

The eigenfunctions are polynomials which can now be derived from
Eqs. (27-3) and (27-5). A more convenient procedure, however, is to
use a power-series expansion about the mid-point of the fundamental
interval:

$$ y = \sum_{\nu} b_\nu x^\nu. $$

If we substitute this series into the differential equation (27-1), give $\lambda$
one of its eigenvalues, and equate the coefficient of each power of $x$ to
zero, we obtain the recurrence formula

$$ b_{\nu+2} = \frac{\nu(\nu + 1) - n(n + 1)}{(\nu + 1)(\nu + 2)}\,b_\nu. $$

This shows that the odd and even powers of $x$ form independent series
as in the linear-oscillator problem of Sec. 20. If $n$ is even, the even
series terminates; if $n$ is odd, the odd series terminates. The eigenfunctions
are evidently the polynomials obtained by suppressing the independent
infinite series. Apart from a constant normalizing factor they
are identical with the well-known Legendre polynomials, or zonal spherical
harmonics. The latter are defined by the formulas

$$ P_0(x) = 1; \qquad P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n, \qquad n = 1, 2, \ldots \tag{27-7} $$

A summary of the most important properties of the Legendre polynomials
is given in Appendix F, together with explicit formulas for the
early members of the series.
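
As an illustration, the terminating series can be generated directly from the recurrence for the coefficients of $x^\nu$. In the Python sketch below the function name `legendre_coefficients` is hypothetical, and the scaling $P_n(1) = 1$ is the customary normalization, assumed here to fix the constant factor left open by the recurrence.

```python
from fractions import Fraction


def legendre_coefficients(n):
    """Coefficients b_0 .. b_n (lowest power first) of the polynomial solution
    for lam = n(n+1), built from
        b_{v+2} = b_v * (v(v+1) - n(n+1)) / ((v+1)(v+2))
    and rescaled so that the polynomial equals 1 at x = 1."""
    b = [Fraction(0)] * (n + 1)
    b[n % 2] = Fraction(1)              # keep only the series with the parity of n
    for v in range(n % 2, n - 1, 2):
        b[v + 2] = b[v] * (v * (v + 1) - n * (n + 1)) / ((v + 1) * (v + 2))
    value_at_one = sum(b)               # unscaled polynomial evaluated at x = 1
    return [c / value_at_one for c in b]
```

For example, `legendre_coefficients(2)` yields the coefficients of $\tfrac12(3x^2 - 1)$ and `legendre_coefficients(3)` those of $\tfrac12(5x^3 - 3x)$.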

¹ Cf. Courant-Hilbert, M.M.P., p. 281.





28. THE ENERGY LEVELS OF THE TWO-PARTICLE PROBLEM 


28a. The Wave Equation. The problem of two interacting particles
is the fundamental eigenvalue problem of the Schrödinger theory. We
assume that the potential energy $V$ depends only on the distance $r$
between the two particles. Then, as in Sec. 15, it is convenient to
introduce the coordinates of the center of gravity and the components
of $r$ as independent variables. Let $x_1, y_1, z_1, x_2, y_2, z_2$ be the absolute
coordinates of the two particles and let $x, y, z$ be defined by

$$ x = x_2 - x_1, \qquad y = y_2 - y_1, \qquad z = z_2 - z_1. \qquad \text{[Cf. Eqs. (15-14)]} $$

From the general equation (17-1) we may deduce the wave equation for
the internal motion of the system when there is a definite internal energy
$E$. It is

$$ \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} + \frac{8\pi^2\mu}{h^2}(E - V)\psi = 0, \tag{28-1} $$

where $\mu$ is defined by

$$ \frac{1}{\mu} = \frac{1}{\mu_1} + \frac{1}{\mu_2}. \tag{28-2} $$

This is also the wave equation for a single particle of mass $\mu$ moving
under the influence of a fixed center of force at the origin of the $x, y, z$
system of coordinates.

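28b. Separation of the Variables. To solve Eq. (28-1), introduce
spherical coordinates $r$, $\theta$, $\varphi$.¹ Direct transformation then yields

$$ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} + \kappa(E - V)\psi = 0. \tag{28-3} $$

Here $\kappa$ denotes the quantity $(8\pi^2\mu)/h^2$ as usual.

The variables can now be separated as in the derivation of Eqs. (15-17)
and (15-18). We seek particular solutions of the form

$$ \psi = R(r)\,Y(\theta, \varphi). \tag{28-4} $$

Inserting this expression into Eq. (28-3) and rearranging, we obtain

$$ \frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \kappa r^2 (E - V) = -\frac{1}{Y}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial Y}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial\varphi^2}\right]. $$

One side of this equation is a function of $r$ only, while the other side
depends upon the polar angle $\theta$ and the azimuthal angle $\varphi$, but not on $r$.
Hence the two sides must have a common constant value which we shall
call $\lambda$, and the equation is equivalent to the pair of simultaneous equations

$$ \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left[\kappa(E - V) - \frac{\lambda}{r^2}\right]R = 0, \tag{28-5} $$

$$ \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial Y}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial\varphi^2} + \lambda Y = 0. \tag{28-6} $$

¹ $\theta$ is the angle between the radius vector and the $z$ axis while $\varphi$ is the azimuthal
angle between the $xz$ plane and a plane through the radius vector and the $z$ axis.
(Cf. Courant-Hilbert, p. 184, or Riemann-Weber, D.P., Vol. I, p. 76.)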

Equation (28-6) plays an important part in potential theory,¹ where it
must be solved in such a way that $Y$ is single-valued and continuous
over the entire sphere. Such solutions are called surface harmonics.
The same requirements apply to our problem.

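The variables in Eq. (28-6) may be separated in turn if we ask for
special solutions of the form

$$ Y = \Theta(\theta)\,\Phi(\varphi). \tag{28-7} $$

The equation then breaks up into the pair of equations

$$ \frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) - \frac{\beta\,\Theta}{\sin^2\theta} + \lambda\Theta = 0, \tag{28-8} $$

$$ \frac{d^2\Phi}{d\varphi^2} + \beta\,\Phi = 0. \tag{28-9} $$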

28c. The Azimuthal Factor of the Wave Function. If $\psi(x, y, z)$ is to
be a single-valued function of the Cartesian coordinates, it follows that
$\Phi(\varphi + 2\pi) = \Phi(\varphi)$. In view of this condition the only admissible
solutions of Eq. (28-9) are the familiar harmonic functions

$$ \Phi = A\sin m\varphi, \quad A\cos m\varphi, \quad \text{or} \quad A e^{im\varphi}; \qquad m = 0, \pm 1, \pm 2, \ldots \tag{28-10} $$

Thus the eigenvalues of $\beta$ are $0^2, 1^2, 2^2, 3^2, \ldots$. In this case, due to
the special boundary condition, there are two linearly independent
solutions for each eigenvalue, a phenomenon called degeneracy.

The normalization condition of Sec. 8 requires that the integral of
$\psi\psi^*$ over all space shall be unity. In spherical coordinates this means that

$$ \int_0^\infty R R^* r^2\,dr \int_0^\pi \Theta\Theta^* \sin\theta\,d\theta \int_0^{2\pi} \Phi\Phi^*\,d\varphi = 1. \tag{28-11} $$

Without loss of generality we can and will require that the separate factor
integrals shall be normalized to unit values:

$$ \int_0^\infty R R^* r^2\,dr = \int_0^\pi \Theta\Theta^* \sin\theta\,d\theta = \int_0^{2\pi} \Phi\Phi^*\,d\varphi = 1. \tag{28-12} $$

Adopting the exponential form of the $\Phi$ functions and introducing the
appropriate value of $A$, we obtain

$$ \Phi_m(\varphi) = (2\pi)^{-1/2} e^{im\varphi}. \tag{28-13} $$
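
The normalization and mutual orthogonality of the exponential azimuthal functions can be confirmed numerically. The sketch below is only an illustration: the names `phi` and `overlap` and the number of quadrature steps are assumptions, and the integral over $0 \le \varphi \le 2\pi$ is approximated by an equally spaced sum.

```python
import cmath
import math


def phi(m, angle):
    """The normalized azimuthal factor (2*pi)^(-1/2) * exp(i*m*angle)."""
    return cmath.exp(1j * m * angle) / math.sqrt(2 * math.pi)


def overlap(m1, m2, steps=2000):
    """Approximate the integral of phi_m1 * conj(phi_m2) over 0 .. 2*pi
    by a sum over `steps` equally spaced angles."""
    h = 2 * math.pi / steps
    total = sum(phi(m1, k * h) * phi(m2, k * h).conjugate() for k in range(steps))
    return total * h
```

The sum returns a value close to unity when the two indices agree and close to zero otherwise, including the pair $m$, $-m$ that belongs to the same eigenvalue $\beta = m^2$.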

¹ Courant-Hilbert, M.M.P., pp. 272, 441; Riemann-Weber, D.P., Vol. I, p. 309.




28d. Determination of $\Theta(\theta)$ and Its Eigenvalues. Having obtained
the eigenvalues of $\beta$, we can proceed to the solution of Eq. (28-8). Inserting
$m^2$ for $\beta$ and introducing $x$ for $\cos\theta$ as a new independent variable,
we reduce the equation to the form

$$ \frac{d}{dx}\left[(1 - x^2)\frac{d\Theta}{dx}\right] + \left[\lambda - \frac{m^2}{1 - x^2}\right]\Theta = 0. \tag{28-14} $$

This is to be solved in the interval $-1 < x < +1$ subject to the requirement
(Sec. 17) that the product $R(r)\Theta(\theta)\Phi(\varphi)$ shall be continuous and twice
differentiable with respect to $x$, $y$, and $z$ at every finite point except the
origin. We shall begin by seeking solutions which are merely continuous
everywhere (the line $\theta = 0, \pi$ contains the only points where continuity
comes in question) and will show later on that solutions of the differential
equation meeting this mild requirement are not only twice differentiable
but actually analytic everywhere except at the origin and at infinity.

In the special case that $m = 0$, Eq. (28-14) reduces to the equation of
the Legendre polynomials (27-1). The continuity requirement is
satisfied if we identify $\Theta(x)$ with one of these polynomials, say $P_l(x)$,
so that the functions $Y_{l,0}(\theta, \varphi) = P_l(\cos\theta)\,\Phi_0(\varphi)$ solve the problem of
Eq. (28-6) with the eigenvalues

$$ \lambda = l(l + 1), \qquad l = 0, 1, 2, 3, \ldots \tag{28-15} $$

We next search for solutions appropriate to other values of $m$. The
end points of the interval $-1 < x < +1$ are still regular singular points
and the fundamental indicial equation (26-5) for the point $x = +1$
reduces to $r^2 = (m/2)^2$. The roots of this equation differ by an integer
so that one solution of any fundamental system is sure to become infinite
at this point. The solutions which do not become infinite vanish at the
end points because the larger root of the indicial equation is positive.

A solution of Eq. (28-14) which satisfies the continuity requirement is
obtainable from each Legendre polynomial of order greater than, or
equal to, $|m|$. To prove this substitute $P_l(x)$ for $y$ in equation (27-1)
and the corresponding eigenvalue $l(l + 1)$ for $\lambda$. Differentiating with
respect to $x$ and making the substitution

$$ P_{l,1}(x) = (1 - x^2)^{1/2}\,\frac{dP_l(x)}{dx}, $$

we obtain

$$ \frac{d}{dx}\left[(1 - x^2)\frac{dP_{l,1}}{dx}\right] + \left[l(l + 1) - \frac{1}{1 - x^2}\right]P_{l,1} = 0. $$

Repeating this process $\tau$ times we arrive at the equation

$$ \frac{d}{dx}\left[(1 - x^2)\frac{dP_{l,\tau}}{dx}\right] + \left[l(l + 1) - \frac{\tau^2}{1 - x^2}\right]P_{l,\tau} = 0, \tag{28-16} $$

in which

$$ P_{l,\tau}(x) \equiv (1 - x^2)^{\tau/2}\,\frac{d^\tau P_l(x)}{dx^\tau}. \tag{28-17} $$

Since Eq. (28-16) is identical in form with Eq. (28-14) when $\tau$ is identified
with $|m|$, the functions $P_{l,\tau}$ form eigenfunctions of our problem with the
eigenvalues previously obtained for the case where $m$ is zero, provided
that they satisfy the boundary conditions.

As $P_{l,\tau}$ vanishes at $\theta = 0$ and $\theta = \pi$, it is evident at once that the
product $Y_{l,m}(\theta, \varphi) = P_{l,|m|}(\cos\theta)\,\Phi_m(\varphi)$ is continuous over the entire
sphere. Furthermore it is not difficult to prove that the product
$r^l\,Y_{l,m}(\theta, \varphi)$ is a homogeneous polynomial of degree $l$ in the Cartesian
coordinates $x$, $y$, $z$.¹ As such it is an analytic function of these variables.
It follows that if $R(r)$ is any function of $r$ analytic in the interval
$0 < r < \infty$, the product $R(r)Y(\theta, \varphi)$ is an analytic function of $x$, $y$, $z$
except at the origin and infinity. This means that the functions $Y_{l,m}(\theta, \varphi)$
satisfy all requirements.

The functions $P_{l,\tau}(x)$ are called associated Legendre functions and the
products

$$ Y_{l,m}(\theta, \varphi) = P_{l,|m|}(\cos\theta)\,\Phi_m(\varphi) $$

are called tesseral harmonics of the $l$th degree and the $m$th order.² They
form a special factorable type of surface harmonic.

28e. Completeness of System of Eigenfunctions. Equation (28-14)
is in the standard Sturm-Liouville form with $p(x)$ equal to $(1 - x^2)$ and
$\rho(x)$ equal to unity. The solutions and their first derivatives are finite
at the boundary points where $p(x)$ vanishes. Hence $P_{l,\tau}(x)$ satisfies
the singular-point boundary conditions of Sec. 23d as well as the physical
boundary conditions. Solutions of the differential equation which do
not satisfy the s.p.b.c. are discontinuous at $x = \pm 1$. There is an
interval $G$ of $\lambda$ values such that for each $\lambda$ in the interval there is a pair
of integral curves $u_\lambda(x)$ and $v_\lambda(x)$ which conform to the s.p.b.c. at $x = +1$
and $x = -1$, respectively. This interval extends from $-\infty$ to $+\infty$,
and we infer from Sec. 25b that the totality of normalized eigenfunctions
of Eq. (28-14) for any fixed value of $m$, solved subject to the s.p.b.c., forms
a complete orthogonal system.

By Sec. 23 the $n$th eigenfunction in a series arranged according to
increasing eigenvalues must have $n - 1$ nodes in the interior of the
fundamental interval $-1 < x < +1$. Hence, if we arrange the eigenfunctions
we have already found in a similar series and discover that
its $n$th member has no more than $n - 1$ nodes, we can be sure that we
have found all the eigenfunctions there are. But it follows from the
definition of $P_{l,\tau}(x)$ given in Eq. (28-17) that $l$ cannot be less than $\tau$.
Hence the ordinal number of $P_{l,\tau}$ in such a series is $(l - \tau + 1)$, and since
$d^\tau P_l(x)/dx^\tau$ is a polynomial of degree $l - \tau$, $P_{l,\tau}$ cannot have more than
$l - \tau$ nodes in any interval. We conclude that the series of eigenfunctions
$P_{l,\tau}$ for $m = \pm\tau$ with $l = \tau, \tau + 1, \tau + 2, \ldots$ is complete.

¹ $r^{l-\tau}\,d^\tau P_l(\cos\theta)/d(\cos\theta)^\tau$ is a homogeneous polynomial of degree $l - \tau$ in $x$, $y$, $z$,
being made up of terms of the form $(r\cos\theta)^{l-\tau-2s}\,r^{2s} = z^{l-\tau-2s}(x^2 + y^2 + z^2)^s$.
Furthermore

$$ r^\tau (\sin\theta)^\tau e^{\pm i\tau\varphi} = (x \pm iy)^\tau $$

is a polynomial of degree $\tau$. The product of these polynomials is a homogeneous
polynomial of degree $l$, viz., $r^l P_{l,\tau}(\cos\theta)\,e^{im\varphi}$ if $m = \pm\tau$.

² Cf. W. E. Byerly, Fourier's Series and Spherical Harmonics, pp. 195-199,
Boston, 1893. Some authors define $P_{l,\tau}$ as l!/(l τ)! times the value we have given.

Degeneracy. As the $(l + 1)$th derivative of $P_l(x)$ is zero, there are
just $l + 1$ solutions of Eq. (28-8) corresponding to the same value of
$\lambda$, viz., $\lambda_l = l(l + 1)$, and associated with the $l + 1$ permissible values
of $\beta$ (or $m^2$). If $\beta$ is zero, there is just one $\Phi$ function, viz., $\Phi_0 = (2\pi)^{-1/2}$.
All other values of $\beta$ yield two linearly independent $\Phi$ functions. Thus
the $l$th eigenvalue of the $Y$ equation, viz., $\lambda_l = l(l + 1)$, has $2l + 1$
linearly independent solutions which are factorable into products of the
form $\Theta\Phi$. It is not difficult to show that the set of all possible
product functions is complete, so that the total number of linearly
independent solutions for $\lambda_l$ is $2l + 1$. We say that $\lambda_l$ has $(2l + 1)$-fold
degeneracy.
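
The counting argument above can be written out as a small Python sketch. The function names `allowed_m` and `phi_functions_per_beta` are hypothetical and serve only to make the bookkeeping explicit.

```python
def allowed_m(l):
    """Values of m compatible with a given l: -l, ..., 0, ..., +l."""
    return list(range(-l, l + 1))


def phi_functions_per_beta(l):
    """Map each permissible beta = m^2 (m = 0 .. l) to the number of linearly
    independent Phi functions: one for beta = 0, two for every other value."""
    return {m * m: (1 if m == 0 else 2) for m in range(l + 1)}
```

For $l = 2$ the permissible values of $\beta$ are 0, 1, 4, carrying 1, 2, 2 functions respectively, which gives the $2l + 1 = 5$ factorable solutions.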

The more important properties of the associated Legendre functions
are summarized in Appendix F. Equation (F15) permits us to normalize
the $\Theta$ functions. Denoting the normalized function derived from
$P_{l,\tau}(x)$ by $\Theta_{l,m}$ we have¹

$$ \Theta_{l,m}(\theta) = \left[\frac{(2l + 1)(l - \tau)!}{2(l + \tau)!}\right]^{1/2} P_{l,\tau}(\cos\theta), \qquad \tau = |m|. \tag{28-18} $$
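
The normalizing factor in (28-18) can be checked exactly, since the square of $P_{l,\tau}(x)$ is a polynomial in $x$. The Python sketch below is an illustration under stated assumptions: $P_l$ is generated by the standard three-term recurrence $(n + 1)P_{n+1} = (2n + 1)xP_n - nP_{n-1}$ (not taken from the text), and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def legendre(l):
    """Coefficients (lowest power first) of P_l from the three-term recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return p_prev
    for n in range(1, l):
        x_p = [Fraction(0)] + p                                   # x * P_n
        padded = p_prev + [Fraction(0)] * (len(x_p) - len(p_prev))
        p_prev, p = p, [((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_p, padded)]
    return p


def derivative(c):
    return [k * c[k] for k in range(1, len(c))]


def multiply(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def integral_minus1_to_1(c):
    # odd powers integrate to zero; x^k contributes 2/(k+1) for even k
    return sum(Fraction(2, k + 1) * ck for k, ck in enumerate(c) if k % 2 == 0)


def theta_norm_squared(l, tau):
    """Integral of Theta_{l,m}^2 sin(theta) d(theta), i.e. the factor of (28-18)
    squared times the integral of (1 - x^2)^tau (d^tau P_l / dx^tau)^2 over -1 .. 1."""
    d = legendre(l)
    for _ in range(tau):
        d = derivative(d)
    poly = multiply(d, d)
    for _ in range(tau):
        poly = multiply(poly, [Fraction(1), Fraction(0), Fraction(-1)])
    factor = Fraction((2 * l + 1) * factorial(l - tau), 2 * factorial(l + tau))
    return factor * integral_minus1_to_1(poly)
```

Evaluating `theta_norm_squared(l, tau)` for small $l$ and $0 \le \tau \le l$ returns exactly 1 in each case.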

¹ Graphs of the normalized $\Theta$'s are given in Condon and Morse, Quantum Mechanics,
Fig. 6, p. 55.

28f. Physical Interpretation of Quantum Numbers $l$ and $m$. We
have now located the eigenvalues of $\lambda$ and determined the associated
angle functions. We proceed to the study of the radial wave equation
(28-5). Substituting the value of $\lambda$ from Eq. (28-15) and changing the
dependent variable from $R$ to $\mathcal{R} = rR$ [cf. Eqs. (23-19) and (23-20)], we
obtain

$$ \frac{d^2\mathcal{R}}{dr^2} + \left[\kappa(E - V) - \frac{l(l + 1)}{r^2}\right]\mathcal{R} = 0, \tag{28-19} $$

an equation identical in form with (18-2), but with an effective potential
energy

$$ V_e(r) = V(r) + \frac{l(l + 1)h^2}{8\pi^2\mu r^2} \tag{28-20} $$

substituted for $V(r)$. If we make the further substitution

$$ \mathcal{R} = e^{2\pi i S/h} $$

and the approximations of Sec. 11, we derive the relation

$$ \frac{1}{2\mu}\left(\frac{dS}{dr}\right)^2 + V(r) + \frac{l(l + 1)h^2}{8\pi^2\mu r^2} = E, $$

which is the same as the classical radial component of the Hamilton-Jacobi
equation with $l(l + 1)h^2/(4\pi^2)$ substituted for the square of the classical
angular momentum. This result suggests that in the wave mechanics
$\frac{h}{2\pi}\sqrt{l(l + 1)}$ is the value of the angular momentum, an interpretation
which will be justified in the next chapter from another point of view.
As $l(l + 1) = (l + \tfrac12)^2 - \tfrac14$, this expression is very similar to the
expression $(l + \tfrac12)h/(2\pi)$ for the angular momentum in the Bohr theory
with half-integral quantum numbers, but permits a state of zero angular
momentum forbidden in the latter theory. In view of this partial
agreement with the Bohr theory we adopt the terminology of that theory
and refer to the integer $l$ as the azimuthal quantum number. (In the Bohr
theory $l$ is associated with the azimuthal angle in the plane of the classical
orbit.)
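
The closeness of the two expressions for the angular momentum is easily tabulated. In the following Python sketch both quantities are expressed in units of $h/(2\pi)$; the function names are hypothetical.

```python
import math


def wave_mechanical_momentum(l):
    """sqrt(l(l+1)), the wave-mechanical value in units of h/(2*pi)."""
    return math.sqrt(l * (l + 1))


def bohr_half_integral_momentum(l):
    """l + 1/2, the Bohr-theory value with half-integral quantum numbers."""
    return l + 0.5


def gap(l):
    """Difference between the Bohr value and the wave-mechanical value."""
    return bohr_half_integral_momentum(l) - wave_mechanical_momentum(l)
```

The gap equals one-half for $l = 0$, where the wave-mechanical value vanishes, and shrinks steadily as $l$ grows, as the identity $l(l + 1) = (l + \tfrac12)^2 - \tfrac14$ requires.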

A similar relationship exists between the integer $m$ of the present
theory and the so-called magnetic quantum number of the Bohr
theory. The latter, when multiplied by $h/(2\pi)$, represented the component
of angular momentum in the direction of the $z$ axis, which was
taken to be that of an incipient magnetic field. Hence the ratio
$m/(l + \tfrac12)$, or $m/k$, gave the cosine of the angle between the normal to the
orbit and the $z$ axis. The angle $\theta$ between the radius vector and the $z$
axis for such a Bohr orbit ranges accordingly from $\sin^{-1}\frac{m}{l + 1/2}$ to
$\pi - \sin^{-1}\frac{m}{l + 1/2}$. The same sharply defined range of values for $\theta$ is
obtained from the present theory, if one uses large values of $l$. If $l - m$
is large, the nodes of $\Theta$ are closely spaced and it is convenient to adopt
the procedure of Sec. 21f, replacing the probability $\Theta^2 \sin\theta\,d\theta$ that the
particle lies in the range $d\theta$ by its mean value over the interval between
successive nodes. In the limit the probability is¹

$$ \overline{\Theta^2}\sin\theta\,d\theta = \begin{cases} \dfrac{\text{constant} \times |\sin\theta|\,d\theta}{\sqrt{\sin^2\theta - \dfrac{m^2 - \frac14}{(l + \frac12)^2}}}, & \text{if } \sin^2\theta > \dfrac{m^2 - \frac14}{(l + \frac12)^2}, \\[2ex] 0, & \text{if } \sin^2\theta < \dfrac{m^2 - \frac14}{(l + \frac12)^2}. \end{cases} $$

Figure 9 shows $\Theta^2 \sin\theta$ plotted against $\theta$ for the case $l = 10$, $m = 5$, and
for comparison the smoothed function $\overline{\Theta^2}\sin\theta$. The concentration of
probability in a limited range of polar angles is, of course, greatest
when $m = l$ so that $\Theta^2$ is proportional to $(\sin\theta)^{2l}$.

Fig. 9. Probability distribution for the polar angle $\theta$.

¹ Changing the dependent variable in (28-8) from $\Theta$ to $U$, where $U = \sqrt{\sin\theta}\,\Theta$,
we obtain the differential equation

$$ \frac{d^2U}{d\theta^2} + \left[\lambda + \tfrac14 - \frac{m^2 - \frac14}{\sin^2\theta}\right]U = 0. $$

This is of the form of (18-2) so that an approximate solution for large values of $\lambda$ is
obtainable by the B. W. K. method. The "wave length" of $U$ with respect to the
independent variable $\theta$ is $2\pi|\sin\theta|\,[(\lambda + \tfrac14)\sin^2\theta - (m^2 - \tfrac14)]^{-1/2}$. We denote this
quantity by $\Lambda(\theta)$. It becomes imaginary when $\sin^2\theta < \frac{m^2 - 1/4}{\lambda + 1/4}$. Hence $U$ is oscillatory
when $\sin^2\theta > \frac{m^2 - 1/4}{\lambda + 1/4}$ and fades abruptly to zero outside this
limit. The B. W. K. approximation (21-7) takes the form

$$ U \cong \text{constant} \times \sqrt{\Lambda(\theta)}\,\cos\left(2\pi\int^{\theta}\frac{d\theta}{\Lambda}\right). $$

Replacing $\cos^2\left(2\pi\int^{\theta} d\theta/\Lambda\right)$ by its average value for the interval between successive
nodes, one obtains

$$ \Theta^2\sin\theta = U^2 \cong \frac{\text{constant} \times |\sin\theta|}{\sqrt{\sin^2\theta - \dfrac{m^2 - \frac14}{\lambda + \frac14}}}. $$

28g. Behavior of Radial Wave Functions at Boundary Points.
Before introducing an explicit potential function let us examine the
behavior of the integral curves of the radial wave equation (28-19)
at the boundary points and verify the equivalence of the physical boundary
conditions and the s.p.b.c. for this Sturm-Liouville problem. We
assume that $V(r)$ approaches a definite limit $V_\infty$ as $r$ becomes infinite, and
that for small values of $r$ the Coulomb inverse square law of force applies.
Then

$$ V(r) = \frac{a}{r} + \varphi(r), \tag{28-22} $$

where $a$ is a constant and $\varphi(r)$ is regular at $r = 0$.

If $E$ is greater than $V_\infty$ the function $\mathcal{R} = rR$ of Eq. (28-19) will
approximate a sine curve for large values of $r$ and cannot be made
quadratically integrable (cf. Sec. 19c, p. 85). Neither the physical
boundary conditions nor the s.p.b.c. can be satisfied and we conclude
that there are no type A eigenfunctions for energies greater than $V_\infty$.

Next assume that $E < V_\infty$. It follows from Sec. 19 and Appendix C
that for each such value of $E$ there are integral curves of Eq. (28-19)
which, together with their derivatives, approach zero monotonically
and "exponentially" as $r$ becomes infinite. Such integral curves satisfy
the s.p.b.c. at infinity and yield solutions of Eq. (28-5) by the transformation
$R = \mathcal{R}/r$ which again satisfy the s.p.b.c. at infinity. As the transforms
$R(r)$ are bounded in the neighborhood of infinity they also satisfy
the physical boundary conditions for type A functions. Conversely,
any integral curve which does not conform to the type A conditions at
infinity will not be quadratically integrable and cannot satisfy the s.p.b.c.
at the outer boundary. We conclude that in the neighborhood of
$r = \infty$ the s.p.b.c. applied to solutions of either of the radial equations
(28-5) and (28-19) are equivalent to the boundary conditions for type A
functions.

Let us next apply the theory of Sec. 26 to the inner boundary point.
The point $r = 0$ is again a regular singular point. Using the symbol $t$
for the exponent of the lowest power of $r$ in $\mathcal{R}(r)$, the indicial equation
[cf. Eq. (26-5)] of (28-19) at the origin is

$$ t(t - 1) - l(l + 1) = 0. $$

Its roots are $t_1 = l + 1$ and $t_2 = -l$. As these two roots differ by an
integer we have to do with the logarithmic case of Eq. (26-6). The two
linearly independent solutions given by Eqs. (26-3) and (26-6) yield

$$ \mathcal{R}_1 = r^{l+1} f_1(r), \qquad R_1 = r^l f_1(r); $$

$$ \mathcal{R}_2 = A r^{l+1} f_1(r)\log r + \frac{f_2(r)}{r^l}, \qquad R_2 = A r^l f_1(r)\log r + \frac{f_2(r)}{r^{l+1}}, $$

where $f_1$ and $f_2$ are regular at $r = 0$ and do not vanish there.

$R_1$ and $\mathcal{R}_1$ satisfy the singular-point boundary conditions for the
equations (28-5) and (28-19), respectively, and also the physical boundary
conditions, while $R_2$ satisfies neither the s.p.b.c. nor the physical conditions.
We conclude that in this case the physical boundary conditions
for type A functions are equivalent to the s.p.b.c. at both inner and
outer boundary points. The range $G$ of energy values over which these
conditions can be met at both ends of the fundamental interval extends
from $E = -\infty$ to $E = V_\infty$. The attention of the reader is called to the
fact that both $R_2$ and $R_1$ are quadratically integrable when $l = 0$. Hence,
the mere requirement of quadratic integrability would not suffice to
pick out the desired orthogonal set of wave functions.
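
The remark about quadratic integrability can be made concrete by looking at the leading power of $|R|^2 r^2$ near the origin. The Python sketch below assumes only the leading behavior $R_1 \sim r^l$ and $R_2 \sim r^{-(l+1)}$ and ignores the logarithmic term, which does not affect convergence; the function names are hypothetical.

```python
import math


def integrand_powers(l):
    """Leading powers of |R|^2 r^2 near r = 0 for R_1 ~ r^l and R_2 ~ r^-(l+1)."""
    return 2 * l + 2, -2 * l


def integrable_at_origin(power):
    """The integral of r**power down to r = 0 stays finite exactly when power > -1."""
    return power > -1


def radial_integral(power, eps):
    """Closed form of the integral of r**power from eps to 1."""
    if power == -1:
        return -math.log(eps)
    return (1 - eps ** (power + 1)) / (power + 1)
```

For $l = 0$ both powers pass the test, whereas for $l = 1$ the integrand for $R_2$ behaves as $r^{-2}$ and `radial_integral(-2, eps)` grows without limit as `eps` shrinks.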

Let $n_l''$ denote the number of interior nodes of an integral curve of
(28-19) for the energy $E = V_\infty$ and for the azimuthal quantum number $l$.
Then it follows from Sec. 23 that there will be $n_l''$ discrete eigenvalues of
$E$ below $V_\infty$. $n_l''$ may be zero in the case of an effective potential energy
with a very shallow minimum or none at all, or it may be infinite. The
Sommerfeld phase integral of Eq. (21-10) evaluated for $E = V_\infty$ gives
an approximate means for evaluating $n_l''$, from which we see that if
$V - V_\infty$ is negative and varies as $1/r$ for large values of $r$, the number of
discrete eigenvalues is infinite independent of the azimuthal quantum
number $l$. On the other hand $n_l''$ is finite and varies with $l$ if $V - V_\infty$
varies as $1/r^2$ or some higher inverse power of $r$ in the neighborhood of
infinity.

Regardless of the number of eigenvalues, the sequence has an upper
bound $V_\infty$, and hence the eigenfunctions do not form a complete system.

In applications of the general theory to the vibration and rotation of
diatomic molecules and to single-electron atoms the number of interior
nodes of the radial eigenfunction is called the radial or vibrational quantum
number. In molecular theory it is usually designated by the symbol $v$
which we here adopt. $v + 1$ is the ordinal number of any eigenvalue
of the energy when arranged in an ascending series together with the
other eigenvalues for the same azimuthal quantum number $l$. In one-electron
atomic problems the so-called "total quantum number"

$$ n = v + l + 1 $$

is introduced instead of $v$ because of its major influence on the energy.
The associated Legendre functions $\Theta_{lm}$ have $l - m$ nodes and the real
and imaginary parts of $\Phi_m$ have each $2m$ nodes. However, if the spherical
harmonic $Y_{l,m} = \Theta_{lm}\Phi_m$ is laid out on the sphere we see that the $2m$
nodes in the real part of $\Phi_m$ may be united in pairs to form $m$ nodal
meridian planes. Thus $Y_{l,m}$ may be said to have $l$ nodes in all, while the
complete wave function

$$ \psi_{nlm} = R_{nl}\,\Theta_{lm}\,\Phi_m $$

has $v + l = n - 1$ nodes. These divide the space around the origin
into cells in each of which the phase of the vibration for, say, the real
part of $\psi$ is opposite to that of its neighbors.

As previously noted there are 2l + 1 linearly independent factorable
solutions of the Y equation (28-6) for each value of l. This degeneracy
carries over to the general equation (28-3), which has 2l + 1 linearly
independent factorable solutions for each energy level. It is due to the
spherical symmetry of the problem and corresponds exactly to the
multiplicity of orbital orientations and azimuthal quantum numbers in
the Bohr theory for the same system. As the orientation of the axes
in space is arbitrary, the independent factorable solutions may be
chosen in an infinity of different ways.

As explained in Sec. 21h, we can compute approximate eigenvalues
of Eq. (28-19) by the B. W. K. method, provided that we replace the
potential function V(r) by $V(r) + \dfrac{(l+\frac12)^2h^2}{8\pi^2\mu r^2}$. The quantum condition
(21-10) then takes the form

$$J(E,l) = \oint\left\{2\mu\left[E - V(r) - \frac{(l+\tfrac12)^2h^2}{8\pi^2\mu r^2}\right]\right\}^{1/2}dr = (v + \tfrac12)h, \qquad v = 0, 1, 2, \cdots \tag{28-23}$$

In other words we compute the radial momentum as if the angular
momentum were given the value $(l + \tfrac12)h/2\pi$ instead of $[l(l+1)]^{1/2}h/2\pi$.
Thus it is to be expected on the basis of the work done in Sec. 21 that the
Bohr energy-level formulas will give the correct energies to a close
approximation provided that we use half-integral values of the azimuthal
and radial, or vibrational, quantum numbers.

There are two important special cases to be considered, viz., (a) the
problem of the hydrogenic, or one-electron, atoms (to be discussed in
Sec. 29) and (b) the problem of the nongyroscopic diatomic molecule.
The latter system is obviously favorable to the application of the B. W. K.
method on account of the large reduced mass of every diatomic molecule
as compared with the electronic mass to be used in atomic problems.
(The accuracy of the usual first B. W. K. approximation is inversely
proportional to the integral $\mu_\Gamma$ of Eq. (21-20) which, in turn, varies
inversely with the square root of the mass.)

28h. The Dumbbell Model of the Diatomic Molecule. — In applying 
the theory of the two-body problem to a diatomic molecule we make 
use of what is called the “dumbbell” mathematical model. It neglects
the details of the electronic structure entirely, assuming that the only
effect of the electrons on the nuclear motion is through a modification
which they produce in the effective mutual potential energy of the two
heavy nuclei. The use of this model is justifiable on theoretical grounds
for those “electronic states” of the molecule which have no average
orbital electronic angular momentum about the internuclear axis (cf.
Sec. 47c).

Of course it is not possible to work out a definite formula for molecular
energy levels from Eq. (28-23) without definite information regarding the
potential function V(r), and just this essential information is lacking. In
principle, V(r) is theoretically predictable on the basis of a solution of
the energy-level problem for the electrons moving under the influence of
fixed nuclei. In practice we are unable to compute V(r) in this way with
any degree of accuracy except in the cases of neutral and ionized molecular
hydrogen. Hence V(r) must be regarded as a priori a quantitatively
unknown function. From a qualitative point of view, however, we can
safely predict that V(r) will have the general form shown in Fig. 10
with a single minimum at the internuclear distance $r_0$. The character
of the curve shown in the figure is dictated by the following considerations.

a. At very short distances the electronic contribution to V(r) must
approach a finite limit which can be identified with the electronic energy
of a single atom formed by uniting the two nuclei. As the contribution
to V(r) from the Coulomb repulsion of the bare nuclei becomes infinite
at r = 0, V(r) itself must have a pole of the first order at the origin. Thus
$\lim_{r\to 0} rV(r) = Z_1Z_2e^2$, where $Z_1$ and $Z_2$ are the atomic numbers of the
two nuclei.

b. At very large distances the force between two neutral atoms must
approach zero more rapidly than $r^{-2}$, and V(r) itself must approach a
finite limit which can be set equal to zero without loss of generality. If a
stable molecule is to be possible V(r) must have a minimum at some
intermediate point $r = r_0$ of the order of magnitude of an atomic
diameter. The absolute value of $V(r_0)$ constitutes what we may describe
as the static dissociation energy of the molecule and is very nearly equal to
the dissociation energy measured by chemical methods.

c. It seems reasonable to assume that V(r) will be a relatively smooth
curve with only the one minimum. This postulate finds support in
approximate computations and in the analysis of empirical band spectra.

Fig. 10.—Potential energy and energy eigenvalues for model of diatomic molecule.

The application of the quantum condition (28-23) to the evaluation of
any energy level requires an accurate specification of the potential
function V(r) only over the classical range of vibration for the corresponding
energy. Hence it suffices for the determination of the lower energy
levels of the molecule that we have a description of V(r) which is accurate
in the neighborhood of the minimum point $r_0$. A power series in $r - r_0$,
or $\xi = (r - r_0)/r_0$, is obviously adapted to our purpose and in favorable
cases should converge rapidly over the required range of values. It is
possible to insert such a series, say

$$V(r) = V(r_0) + \tfrac12 k\,(r - r_0)^2\left(1 + a\xi + b\xi^2 + \cdots\right),$$

into (28-23) and to evaluate J as a power series in $E - V(r_0)$.

Reversing the latter series, one obtains an expression for E as a double
power series in J and $(l + \tfrac12)^2$, the coefficients being known functions of
$r_0$, a, b, $\cdots$.¹ Inserting $(v + \tfrac12)h$ for J, and dropping higher power terms, we
obtain in crude first approximation

$$E = V(r_0) + (v + \tfrac12)h\omega_0 + (l + \tfrac12)^2\frac{h^2}{8\pi^2\mu r_0^2} + \cdots, \qquad v = 0, 1, 2, \cdots;\quad l = 0, 1, 2, \cdots \tag{28-24}$$

where $\omega_0$ is the classical vibrational frequency of the molecule for small
oscillations about $r_0$, viz., $\dfrac{1}{2\pi}\sqrt{k/\mu}$. The corresponding approximation for
the integral form of Bohr theory is

$$E = V(r_0) + vh\omega_0 + l^2\frac{h^2}{8\pi^2\mu r_0^2} + \cdots, \qquad v = 0, 1, 2, \cdots;\quad l = 1, 2, 3, \cdots \tag{28-25}$$


What may be described as the rotational spacing of the energy levels
is obtained by holding v fixed and allowing l to take on successive integral
values. According to (28-24) the successive intervals obtained in this
way are in the ratio 2, 4, 6, $\cdots$, whereas (28-25) gives intervals in the
ratio 3, 5, 7, $\cdots$. Observations on band spectra show that the former
ratios are correct, at least for small values of l.
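
The two sets of ratios can be checked by direct arithmetic. The following Python sketch is an added illustration, not part of the original derivation. It assumes that only the rotational terms of (28-24) and (28-25) matter, measures energies in units of $h^2/8\pi^2\mu r_0^2$, and uses a hypothetical helper name, `rotational_intervals`.

```python
from fractions import Fraction

def rotational_intervals(term, l_values):
    """Differences between successive rotational terms.

    `term` maps l to the rotational energy in units of h^2/(8 pi^2 mu r0^2);
    `l_values` is the ascending list of allowed l. (Hypothetical helper.)
    """
    energies = [term(l) for l in l_values]
    return [int(b - a) for a, b in zip(energies, energies[1:])]

# Rotational term of (28-24): (l + 1/2)^2 with l = 0, 1, 2, ...
half_integral = rotational_intervals(
    lambda l: (l + Fraction(1, 2)) ** 2, range(0, 5))

# Rotational term of (28-25): l^2 with l = 1, 2, 3, ...
integral = rotational_intervals(lambda l: Fraction(l) ** 2, range(1, 6))

print(half_integral)  # successive intervals for half-integral quantum numbers
print(integral)       # successive intervals for the integral Bohr form
```

Exact rational arithmetic is used so that the intervals come out as whole numbers without rounding.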

The vibrational spacing obtained by holding l fast and varying v is
the same for both formulas and gives us no hold on the legitimacy of
half-integral vibrational quantum numbers. However, if we take the energy
difference between two levels belonging to different electronic states
but with v and l in each case given their lowest values, we obtain a
contribution $h\omega_0/2$ from (28-24) which is absent when we use (28-25). This
contribution shows up when we compare the spectra of two molecules
with chemically similar atoms but different nuclear masses. Thus it
enters into the isotope effect in band spectra which has been shown by
Mulliken to require the use of half-integral vibrational quantum numbers.

To summarize, we may say that in the study of the rotational and 
vibrational levels of diatomic molecules we have a field in which wave 
mechanics gives essentially the same predictions as a modified Bohr 
theory with half-integral quantum numbers. These predictions are 
verified experimentally. 

29. THE HYDROGENIC ATOM 

29a. Application of the B. W. K. Method. — The most important two- 
body problem is that of the hydrogenic atoms. We assume a nucleus of
charge Ze and a single electron under the influence of a Coulomb potential
$-e^2Z/r$.

¹ Cf. E. C. Kemble, Proc. Nat. Acad. Sci. 7, 283 (1921); J.O.S.A. 12, 1 (1926).
Another and equivalent procedure due to Kratzer is given in A. E. Ruark and H. C.
Urey, Atoms, Molecules, and Quanta, sec. 4, Chap. XII, New York, 1930, and elsewhere.
Cf. also J. L. Dunham, Phys. Rev. 41, 721 (1932) for higher B. W. K. approximations.
The author regards the criticism of the B. W. K. method by Rosenthal and Motz,
Proc. Nat. Acad. Sci. 23, 269 (1937) as unwarranted.

The B. W. K. method yields particularly good results in this case in
spite of the small electronic mass, because the effective potential energy
for the radial motion consists of two terms, one varying as $1/r$ and the
other as $1/r^2$. It follows that the classical local momentum p has no
zeros except those at the classical turning points. Hence we can choose
the path $\Gamma$ of Sec. 21h as the sum of a portion of the axis of imaginaries
and a quarter circle of arbitrarily large radius drawn about the origin
as a center. In this case $\mu_\Gamma$ reduces to the very small contribution
involved in getting away from the origin and the a priori accuracy of the
procedure is high. Actually, the B. W. K. energy values are identical
with those derived from an exact integration of the radial equation.

The phase integral J(E,l) of Eq. (28-23) can be evaluated explicitly.¹
We obtain

$$J(E,l) = -(l + \tfrac12)h + \frac{2\pi\mu Ze^2}{\sqrt{-2\mu E}} = (v + \tfrac12)h.$$

Solving for E, one derives the familiar formula

$$E = -\frac{2\pi^2\mu Z^2e^4}{n^2h^2} = -\frac{NZ^2hc}{n^2}, \tag{29-1}$$

where N is the Rydberg constant in wave-number units and n is an
abbreviation for the integer v + l + 1. The combination of the
half-integral radial quantum number $v + \tfrac12$ with the azimuthal quantum
number l, and the correction of the B. W. K. method for radial motion,
to give the simple formula (29-1), with its integral denominator, is rather
curious.
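
The agreement between the B. W. K. energies and the exact ones can also be tested numerically. The sketch below is an added illustration under stated assumptions: atomic units ($\mu = e = 1$, $h = 2\pi$, Z = 1), so that the exact levels are $-1/2n^2$; the function names are hypothetical; the phase integral of (28-23) is evaluated by a midpoint rule after the substitution r = mid + half sin t, and the quantum condition is then solved for E by bisection.

```python
import math

def phase_integral(E, l, steps=2000):
    """Radial phase integral J(E, l) for the Coulomb field, E < 0.

    Assumes atomic units (mu = e = 1, hbar = 1, Z = 1) and the replacement
    l(l+1) -> (l + 1/2)^2 described in the text.
    """
    L2 = (l + 0.5) ** 2
    # Classical turning points: roots of 2*E*r^2 + 2*r - L2 = 0.
    disc = math.sqrt(max(0.0, 1.0 + 2.0 * E * L2))
    r1 = (-1.0 + disc) / (2.0 * E)
    r2 = (-1.0 - disc) / (2.0 * E)
    mid, half = 0.5 * (r1 + r2), 0.5 * (r2 - r1)
    dt = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = -0.5 * math.pi + (k + 0.5) * dt
        r = mid + half * math.sin(t)
        p2 = 2.0 * (E + 1.0 / r) - L2 / (r * r)   # squared radial momentum
        total += math.sqrt(max(p2, 0.0)) * half * math.cos(t)
    return 2.0 * total * dt                        # out and back

def bwk_energy(v, l):
    """Solve J(E, l) = (v + 1/2) h for E by bisection (h = 2*pi here)."""
    target = (v + 0.5) * 2.0 * math.pi
    lo = -1.0 / (2.0 * (l + 0.5) ** 2)   # circular orbit, J = 0
    hi = -1.0e-4                         # J is already very large here
    for _ in range(60):
        m = 0.5 * (lo + hi)
        if phase_integral(m, l) < target:
            lo = m
        else:
            hi = m
    return 0.5 * (lo + hi)

for v, l in [(0, 0), (1, 0), (0, 1)]:
    n = v + l + 1
    # The two printed energies should agree closely.
    print(v, l, bwk_energy(v, l), -1.0 / (2.0 * n * n))
```

The sine substitution removes the square-root singularities of the integrand at the turning points, which is why a plain midpoint rule is adequate in this sketch.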

The same result is readily obtainable along with the corresponding
eigenfunctions by exact integration of the differential equation for $\mathcal{R}(r)$.
This equation was rigorously solved in Schrödinger's first paper and
gave the initial impetus to this point of view. He treated the equation
by a method of complex integration due to Poincaré and Horn. The
more elementary Sommerfeld polynomial method of getting the solution
is described herewith.

29b. Application of Polynomial Method.—As a first step we express
the energy E and the radius r in terms of new units. As unit energy we
choose the value $2\pi^2\mu e^4Z^2/h^2$ of the lowest energy level in the Bohr theory.
As unit radius we choose the radius of the innermost Bohr orbit, viz.,
$h^2/4\pi^2\mu e^2Z = a_0/Z$. Then

$$E \to \epsilon = \frac{Eh^2}{2\pi^2\mu e^4Z^2} = \frac{2a_0E}{e^2Z^2}, \qquad r \to \rho = \frac{rZ}{a_0}.$$

The differential equation takes the simplified form

$$\frac{d^2\mathcal{R}}{d\rho^2} + \left[\epsilon + \frac{2}{\rho} - \frac{l(l+1)}{\rho^2}\right]\mathcal{R} = 0. \tag{29-2}$$

¹ A. Sommerfeld, Ann. d. Physik 51, 1 (1916), any edition of Sommerfeld's
Atombau und Spektrallinien, or Ruark and Urey, Atoms, Molecules, and Quanta,
Chap. V, New York, 1930.


Following the procedure of Sec. 20 we drop the terms in $\rho^{-1}$ and $\rho^{-2}$ in
order to examine the behavior of the solutions for very large values of $\rho$.
We find that

$$\mathcal{R}(\rho) \sim e^{\pm\sqrt{-\epsilon}\,\rho}, \qquad \rho \gg 1.$$

In order to satisfy the boundary conditions it is necessary that $\epsilon$ shall be
negative. For the same reason the solution with the positive exponent
must be rejected. To obtain a solution valid for small r we make the
substitution

$$\mathcal{R} = \rho^{l+1}e^{-\sqrt{-\epsilon}\,\rho}L(\rho)$$

and obtain the differential equation

$$\rho L'' + 2\left[l + 1 - \rho\sqrt{-\epsilon}\right]L' + 2\left[1 - (l+1)\sqrt{-\epsilon}\right]L = 0 \tag{29-3}$$

for L. L should be regular at the origin, and we therefore assume the
power series

$$L = \sum_{\nu=0}^{\infty} a_\nu\rho^\nu, \tag{29-4}$$

which yields the recurrence formula

$$a_{\nu+1}\left[(\nu+1)(2l + \nu + 2)\right] = 2a_\nu\left[(l + \nu + 1)\sqrt{-\epsilon} - 1\right]. \tag{29-5}$$

The series breaks off with the vth term and yields a solution of the
boundary-value problem if

$$-\epsilon = \frac{1}{(v + l + 1)^2} = \frac{1}{n^2}. \tag{29-6}$$

The corresponding energy values are given by (29-1). If $\epsilon$ does not meet
the condition (29-6), the series for $L(\rho)$ behaves like $e^{2\sqrt{-\epsilon}\,\rho}$ for large
values of $\rho$, while that for $\mathcal{R}$ behaves like $e^{\sqrt{-\epsilon}\,\rho}$. Hence $\mathcal{R}$ becomes
infinite at $\rho = \infty$. We conclude that the solutions obtained above are
the only solutions which satisfy the boundary conditions. We may
draw the same inference from the fact that the factor $L_{nl}(\rho)$, in the vth
eigenfunction for any given l, is a polynomial of degree v and hence
cannot have more than v nodes (cf. Sec. 28e).
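
The termination of the series can be followed step by step. The short Python sketch below is an added illustration; the function name `series_coefficients` and the choice $a_0 = 1$ are assumptions made for the example. It iterates the recurrence formula (29-5) in exact rational arithmetic, which is possible whenever $\sqrt{-\epsilon}$ is a rational number.

```python
from fractions import Fraction

def series_coefficients(root, l, count):
    """First `count` coefficients a_0, a_1, ... of the series (29-4).

    `root` stands for sqrt(-epsilon); a_0 is set to 1 (an arbitrary
    normalization chosen for this sketch).
    """
    coeffs = [Fraction(1)]
    for nu in range(count - 1):
        numerator = 2 * coeffs[-1] * ((l + nu + 1) * root - 1)
        denominator = (nu + 1) * (2 * l + nu + 2)
        coeffs.append(numerator / denominator)
    return coeffs

# sqrt(-epsilon) = 1/n with n = 3, l = 0: the series stops after v = 2.
print(series_coefficients(Fraction(1, 3), 0, 5))

# A value of sqrt(-epsilon) that is not of the form 1/n: no coefficient vanishes.
print(series_coefficients(Fraction(2, 5), 0, 5))
```

With $\sqrt{-\epsilon} = 1/n$ the bracket on the right of (29-5) vanishes at $\nu = n - l - 1 = v$, so every later coefficient is zero.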



29c. Generalized Laguerre Polynomials.—The polynomials $L_{nl}(\rho)$
defined by Eqs. (29-3) and (29-4) are identical to a constant factor with
the generalized Laguerre polynomials $L^{2l+1}_{n+l}(x)$ described in Appendix G.
This may be proved by using the substitution

$$x = \frac{2\rho}{n} = \frac{2Zr}{na_0}$$

to reduce Eq. (29-2) to the form of Eq. (G-8), Appendix G. The radial
eigenfunctions now take the form

$$\mathcal{R}_{nl}(r) = C_{nl}\,x^{l+1}e^{-x/2}L^{2l+1}_{n+l}(x). \tag{29-7}$$

The constant $C_{nl}$ can be determined to satisfy the normalization
condition (28-12). Then Eq. (G-12) of Appendix G yields

$$\int_0^\infty R_{nl}R_{nl}^{*}\,r^2\,dr = 1,$$

and the normalized radial wave function is


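$$R_{nl}(r) = \left\{\frac{(n-l-1)!}{2n[(n+l)!]^3}\left(\frac{2Z}{na_0}\right)^3\right\}^{1/2}\left(\frac{2rZ}{na_0}\right)^{l}e^{-rZ/na_0}\,L^{2l+1}_{n+l}\!\left(\frac{2rZ}{na_0}\right). \tag{29-8}$$

Multiplying together the different factors of our complete normalized
wave functions, we obtain

$$\psi_{nlm}(r,\theta,\varphi) = \left\{\frac{(2l+1)(l-|m|)!\,(n-l-1)!}{4\pi\,(l+|m|)!\;2n[(n+l)!]^3}\left(\frac{2Z}{na_0}\right)^3\right\}^{1/2}e^{im\varphi}\,P_{l,|m|}(\cos\theta)\left(\frac{2rZ}{na_0}\right)^{l}e^{-rZ/na_0}\,L^{2l+1}_{n+l}\!\left(\frac{2rZ}{na_0}\right). \tag{29-9}$$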


Tables containing the explicit forms of the radial wave functions for the
different lower energy levels are given in Pauling and Goudsmit, The
Structure of Line Spectra, New York, 1930. In the case of the lowest
energy level (normal state) of a hydrogenic atom we have the single

— ) e 
ao/ 

As the energy is fixed by the single quantum number n (= v + l + 1),
the hydrogen atom in the present approximation (neglecting relativity
and spin corrections) has a special degeneracy referable to the Coulomb
law of force which does not occur in the general two-body problem.
The energy level $E_n$ has l values ranging from 0 to n − 1. Hence the
number of different sets of values of l and m, i.e., the number of
independent factorable solutions, is

$$\sum_{l=0}^{n-1}(2l + 1) = n^2.$$
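
The count can be confirmed by listing the pairs (l, m) explicitly. The few lines of Python below are an added illustration with a hypothetical function name; they enumerate l = 0, ..., n − 1 and m = −l, ..., +l and compare the total with $n^2$.

```python
def factorable_solutions(n):
    """All (l, m) pairs belonging to the total quantum number n."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

for n in range(1, 5):
    pairs = factorable_solutions(n)
    # The number of pairs should equal n squared.
    print(n, len(pairs), n * n)
```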



29d. The Most General Eigenfunction.—The most general possible
type A wave function for the energy level $E_n$ is a linear combination of the
terms of the type $\psi_{nlm}$. To prove the point we assume that
$\psi_n(r,\theta,\varphi)$ is a type A eigenfunction corresponding to the energy $E_n$.
$\psi_n$ must then be periodic in $\varphi$ with the period $2\pi$. Hence we may expand
it into the complex Fourier series

$$\psi_n = \sum_{m=-\infty}^{+\infty} F_m(r,\theta)\,\Phi_m(\varphi).$$

As the functions $\Theta_{lm}$, for any given value of m, form a complete orthogonal
series, we can expand each of the F's in terms of them. Then

$$\psi_n = \sum_{m=-\infty}^{+\infty}\sum_{l=|m|}^{\infty} G_{lm}(r)\,\Theta_{lm}(\theta)\,\Phi_m(\varphi).$$

If we substitute this series into Eq. (28-3) and make use of the fact that
$\Theta_{lm}$ and $\Phi_m$ are solutions of Eqs. (28-8) and (28-9), respectively, we obtain

$$\sum_{m=-\infty}^{+\infty}\sum_{l=|m|}^{\infty}\left\{\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dG_{lm}}{dr}\right) + \kappa\left[E_n - V - \frac{l(l+1)}{\kappa r^2}\right]G_{lm}\right\}\Theta_{lm}\Phi_m = 0,$$

where $\kappa$ stands for $8\pi^2\mu/h^2$. Let this equation be multiplied through by
$\Theta_{l'm'}\Phi_{m'}^{*}\sin\theta$ and integrated term by term over all values of $\theta$ and $\varphi$.
Most of the terms drop out on account of the orthogonality properties of
the $\Theta$ and $\Phi$ functions and the equation reduces to

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dG_{l'm'}}{dr}\right) + \kappa\left[E_n - V - \frac{l'(l'+1)}{\kappa r^2}\right]G_{l'm'} = 0. \tag{29-10}$$

Equation (29-10) has the form of Eq. (28-5) with $\lambda = l'(l' + 1)$. It
reduces to the form (29-2) if we insert the expression $-Ze^2/r$ for V,
introduce appropriate units, and change the dependent variable from
G to $\mathcal{R} = rG$. Hence it has no nontrivial quadratically integrable
solutions unless $n = l' + v + 1$, where v is an integer greater than or
equal to zero. This means that all the G's vanish for which $l' > n - 1$.
The G's which do not vanish identically are the radial eigenfunctions
$R_{nl'}$ previously derived. Thus the theorem is proved.

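The space factor obtained in this manner is conveniently written
in the form

$$\psi_n = \sum_{l=0}^{n-1}\sum_{m=-l}^{+l} c_{lm}\,\psi_{nlm}.$$

If $\psi_n$, $\psi_{nlm}$ are normalized,

$$\sum_{l=0}^{n-1}\sum_{m=-l}^{+l} |c_{lm}|^2 = 1. \tag{29-11}$$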



CHAPTER VI 


THE CONTINUOUS SPECTRUM AND THE BASIC PROPERTIES 
OF SOLUTIONS OF THE MANY-PARTICLE PROBLEM 


30. THE CONTINUOUS SPECTRUM IN ONE-DIMENSIONAL PROBLEMS 


30a. The Nature and Use of the Eigenfunctions of the Continuous 
Spectrum. — We have already seen in Sec. 19 that if the potential-energy 
function V{x) in a linear oscillator problem approaches a finite limit D 
as the coordinate x becomes infinite, the eigenvalue problem has no 
type A solutions for energies greater than D. For every value of E
greater than D, however, there is another type of eigenfunction (type B)
which is bounded and continuous everywhere, but not quadratically
integrable. Classical energies greater than D correspond to aperiodic
motions in which the particle does not remain permanently in the
neighborhood of the potential minimum but flies off to infinity after its first
collision with the potential barrier on the left (cf. Fig. 6). As all energies
greater than D are on the same footing, the problem of finding the
allowed values of E in this range does not have to be solved. We shall
nevertheless be interested in wave packets whose average energy is 
greater than D. In order to develop a proper theory of the relative
probabilities of different energies for such packets we should like to
express arbitrary physically admissible solutions of
$H\psi = -\dfrac{h}{2\pi i}\dfrac{\partial\psi}{\partial t}$ as
linear combinations of single-energy functions. Resolutions of this
type are also of considerable importance for perturbation theory.

The desired resolution cannot be obtained with the aid of any linear 
combination of distinct type B eigenfunctions, for such a combination 
is never quadratically integrable. The difficulty can be avoided, how- 
ever, by using an integral over a continuous family of the continuous- 
spectrum eigenfunctions. We are already familiar with an example 
of the appropriate type of analysis through our application of the Fourier 
integral theorem to the wave packets of free particles. The integrands 
of Eqs. (9-4) and (10-3) are actually type B eigenfunctions of $H\psi = E\psi$
for free particles, in one dimension and three dimensions, respectively. 

The theory of the continuous spectrum is nevertheless appreciably 
more difficult to handle than the theory of discrete spectra. Hence it is of 
importance to note that for many purposes some or all of the difficulties
can be sidestepped by making use of the fact that by a slight modification
of many physical problems, of no practical importance as regards
the final outcome, one can substitute a purely discrete spectrum for the
discrete-continuous one. For this purpose one has only to place the
system under discussion in a large imaginary box, requiring that $\psi$
shall vanish whenever one of the particles which compose the system
touches its surface. It is evident that if the box is large enough, it 
will be legitimate to assume that results computed for the modified 
problem are as good as those computed for the original one in which 
coordinate space extends to infinity. The box eliminates the continuous 
spectrum completely both in one-dimensional and many-dimensional 
problems. 

30b. The Weyl Theory. — In Chap. IV we have developed the required 
theory of the discrete eigenfunctions of the Sturm-Liouville equation 

(30-1) 

for the case in which the fundamental interval is bounded by singular 
points a and b. In Sec. 23 we determined the properties of solutions of
(30-1) in the neighborhood of the boundary points in order that there
shall be a discrete spectrum of eigenvalues extending through a finite, 
or infinite, interval of real values of X. In Sec. 25 we proved the com- 
pleteness of the system of discrete eigenfunctions for the case in which 
the discrete spectrum extends to + oo and so laid the foundation for the 
expansion theorem. What we now desire is a generalization of these 
results for cases in which the discrete spectrum does not extend to 
infinity. 

The problem has been treated by Weyl¹ in a basic paper which
unfortunately involves an elaborate mathematical technique and makes
difficult reading for the non-specialist. The class of problem discussed
explicitly by Weyl is that in which (30-1) is to be solved in the interval
$0 \le x < \infty$, the left-hand boundary point being nonsingular and the
boundary conditions applied at this point being of the homogeneous
type. It is clear, however, that the validity of his conclusions will
not be affected if the left-hand boundary is at an arbitrary singular
point a, provided that we use the singular-point boundary condition
at a and provided also that for any real $\lambda$ a unique choice of the
nonmultiplicative constant of integration gives a solution of (30-1) which
conforms to those boundary conditions and has a finite number of nodes
in the neighborhood of a. We assume the same restrictions on the
coefficients of (30-1) as before.²

¹ Hermann Weyl, Math. Annalen 68, 220 (1910). Important contributions to
the theory of the continuous spectrum have been made by E. Fues, Ann. d.
Physik (4) 81, 281 (1926) and J. R. Oppenheimer, Zeits. f. Physik 41, 268 (1927),
Phys. Rev. 31, 66 (1928).

² The right-hand boundary point can also be brought in from infinity, but for our
purpose this is unnecessary.



Let us next assume that the real values of $\lambda$ fall in one of two intervals
$\Omega_1$ and $\Omega_2$ having the following properties. In $\Omega_1$ the right-hand
boundary at $x = \infty$ has the same properties as the left-hand boundary. This
is then a region in which a discrete spectrum can exist. In $\Omega_2$ it is not
possible to satisfy the singular-point boundary conditions at $x = +\infty$
at all.

Under the circumstances described we can deduce from Weyl's
paper and Sec. 23 that in the interval $\Omega_1$ there will be a normal discrete
spectrum or no spectrum of eigenvalues at all. In $\Omega_2$ there is a continuous
spectrum of eigenvalues whose (type B) eigenfunctions satisfy the s.p.b.c.
at x = a and yield so-called eigendifferentials which satisfy them both
at x = a and at $x = \infty$.

In order to define the eigendifferentials we assume that $y(x,\lambda)$ denotes
a real continuous function of x and $\lambda$ which for every fixed $\lambda$ in $\Omega_2$ is a
solution of (30-1) conforming to the s.p.b.c. at x = a. It includes all
suitably normalized type B eigenfunctions of the equation. Then the
function

$$\Delta_\tau y = \int_{\lambda_\tau}^{\lambda_\tau + \eta_\tau} y(x,\lambda)\,d\lambda \tag{30-2}$$

is called an eigendifferential if the interval $\lambda_\tau < \lambda < \lambda_\tau + \eta_\tau$ lies in $\Omega_2$.
The fact that $\Delta_\tau y$, unlike $y(x,\lambda)$, satisfies the s.p.b.c. at $x = \infty$ for
arbitrarily small values of $\eta_\tau$ is the criterion by which we identify $y(x,\lambda)$
as a type B eigenfunction and $\lambda_\tau$ as a continuous-spectrum eigenvalue.

Let us now assume that /(x) is an arbitrary twice-differentiable 
function of x which conforms to the s.p.b.c. at x == a and yields convergent 
integrals 

p{x)\f\^dx-, (30-3) 

Then f(x) has an absolutely and uniformly convergent expansion of the
form

$$f(x) = \sum_n c_n y_n(x) + \int_{\Omega_2} c(\lambda)\,y(x,\lambda)\,d\lambda. \tag{30-4}$$

Here the discrete eigenfunctions $y_n(x)$ are conveniently normalized in the
usual way, but the type B eigenfunctions require a different normalization
usually stated as the condition that¹

= 1 (30-5) 

for all values of $\lambda$ in $\Omega_2$. The Fourier coefficients $c_n$, $c(\lambda)$ are to be
determined by the formulas

Cn =■ (p/(x), y«(x));^ c(X) = lim (30-6) 

¹ If the “density function” $\rho$ is a constant (the usual case) it is best to omit $\rho$ in
(30-6) and (30-7). It then drops out of (30-8) and (30-9) automatically.

If f(x) is absolutely integrable and $y(x,\lambda)$ has an upper bound, the latter
formula can be replaced by the simpler one

$$c(\lambda) = (\rho f(x),\, y(x,\lambda)). \tag{30-7}$$

Weyl does not take up the question of completeness, but term-by-term
integration gives the completeness relation

$$(\rho f, f) = \sum_n |c_n|^2 + \int_{\Omega_2} |c(\lambda)|^2\,d\lambda \tag{30-8}$$

for the class of functions f(x) specified. Furthermore it is possible to
extend this theorem to arbitrary piecewise continuous quadratically
integrable functions by the method cited in Sec. 25. Finally we
can use the method of Sec. 22 [cf. Eq. (22-32) ff.] to derive the
scalar-product relation

$$(\rho f, g) = \sum_n c_n b_n^{*} + \int_{\Omega_2} c(\lambda)\,b^{*}(\lambda)\,d\lambda \tag{30-9}$$

for arbitrary piecewise continuous functions f, g, quadratically integrable
with respect to $\rho(x)$ and having the Fourier coefficients $c_n$, $c(\lambda)$, and $b_n$,
$b(\lambda)$, respectively.

*30c. Formal Treatment of Continuous Spectrum as the Limit of a
Discrete Spectrum.—In this and succeeding sections we supplement the
above summary of the results obtained by Weyl with a heuristic
discussion intended to motivate conclusions whose origin must otherwise be
veiled in an atmosphere of mystery. For simplicity we restrict our
elementary treatment of the problem to the special case of the linear
oscillator equation (cf. Secs. 18 and 19)

$$y'' + \frac{8\pi^2\mu}{h^2}\{E - V(x)\}y = 0. \tag{30-10}$$

Here E plays the part of $\lambda$ in (30-1). We assume that V(x) has a pole
of the first or higher order at the left-hand boundary point x = a and
approaches a definite limit at $x = +\infty$ which we can identify with the
zero level of the energy E.¹ V(x) is assumed to be continuous at every
interior point of the fundamental interval $a \le x < +\infty$.

Under these circumstances we can satisfy the singular-point boundary
conditions at x = a for every E by choosing an integral curve y(x) which
has a node at x = a. Similarly we can satisfy the s.p.b.c. at $+\infty$ for
every negative energy by choosing an integral curve for which

$$y(+\infty) = 0.$$

Solutions of (30-10) which satisfy the s.p.b.c. at either end of the
fundamental interval have a finite number of zeros near that end point.
There are no positive energy solutions which satisfy the s.p.b.c. at $+\infty$
or which are quadratically integrable over the entire fundamental
interval. An integral curve of (30-10) for E > 0 with a node at x = a
constitutes a type B eigenfunction. We designate by y(x,E) a real
family of such eigenfunctions which is continuous in E. This family
will be uniquely defined when we have fixed on a definite scheme for
normalization.

¹ The argument would be sensibly unchanged by supposing that the left-hand
boundary point is at $-\infty$ and that $\lim_{x\to-\infty} V(x) = +\infty$.

The problem to be dealt with is that of finding the eigenvalues and
eigenfunctions of (30-10) for the infinite fundamental interval $a \le x < +\infty$
and of setting up a scheme for expanding an arbitrary quadratically
integrable and piece-by-piece continuous f(x) spread out over that interval
into a discrete-continuous linear combination of eigenfunctions. This
we designate as problem $\alpha$. For comparison purposes it is convenient to
introduce a modified problem based on the same differential equation,
but using a finite fundamental interval $a \le x \le b$. This second problem
will be designated as problem $\beta$. We employ the s.p.b.c. at the left-hand
boundary point x = a and give the problem a purely discrete spectrum
by requiring that all eigenfunctions shall vanish at the right-hand
boundary, x = b.

Every eigenfunction of $\beta$ for a positive eigenvalue can be made to
coincide throughout the interval $a \le x \le b$ with the function y(x,E) for
the same energy, provided that we choose a suitably modified
normalization for the former in the place of the usual one. Let $\epsilon_n(b)$ denote the
nth discrete eigenvalue of problem $\beta$ and let $W_{nb}(x)$ denote the
corresponding eigenfunction. We adopt the modified normalization of the W's
so that, for positive $\epsilon_n(b)$,

$$W_{nb}(x) = \begin{cases} y[x, \epsilon_n(b)], & a \le x \le b \\ \text{undefined}, & b < x \end{cases} \tag{30-11}$$

Hence it is not necessary for most purposes to distinguish between $W_{nb}(x)$
and $y[x, \epsilon_n(b)]$ when the energy is positive. If $\epsilon_n(b)$ is negative, $W_{nb}(x)$
does not coincide with an eigenfunction of $\alpha$, although it is a solution of
its differential equation. Figure 11 shows a possible potential-energy
curve and illustrates qualitatively the nature of the eigenfunctions of
problem $\beta$.

As explained in Sec. 18, an increase in energy is always accompanied by
a decrease in the spacing of the nodes of the integral curves and vice versa.
Hence, if we consider only those integral curves of (30-10) which have a
node at x = a, a movement of the remaining nodes to the right must be
accompanied by a decrease in the energy. Since the last node of each
eigenfunction of problem $\beta$ occurs at x = b, an increase in b must lower
all the energy eigenvalues. If these energies are plotted as ordinates
against b as abscissa, one obtains a set of monotonic decreasing curves
as shown in Fig. 12.

Since the discrete eigenfunctions of problem $\alpha$ have nodes at infinity,
it is clear that each class A solution of problem $\alpha$ (if any) is the limit,
as b becomes infinite, of the corresponding solution of problem $\beta$. There
may be an infinite number of discrete energy levels, a finite number, or
none at all, in the spectrum of problem $\alpha$ depending on the form of V(x).
In the first mentioned case every energy level of problem $\beta$ goes into a
discrete level of problem $\alpha$ when b becomes infinite. Nevertheless,
even in this case the limit of the complete spectrum of $\beta$, as b becomes
infinite, is not a purely discrete spectrum, for, despite the fact that
positive eigenvalues are constantly becoming negative with increasing b,
the number of levels in any interval $0 < \epsilon < E$ increases without limit
as $b \to \infty$. In other words the spacing of the levels in the neighborhood
of any fixed positive energy value approaches zero as $b \to \infty$.

Fig. 11.—Illustrating problems $\alpha$ and $\beta$. Eigenfunctions for two energy levels of problem
$\beta$ are drawn in, using the corresponding lines of constant energy as axes of abscissas.

Fig. 12.—The energy levels of problem $\beta$ as functions of b.



*30d. The Spacing of Energy Levels in Problems $\beta$ and $\alpha$.—The
above proposition is conveniently proved by setting up an explicit
formula for the spacing of the energy levels in problem $\beta$. Such a
formula is derived in Appendix H with the following result:

$$\Delta\epsilon(n,b) \equiv \epsilon_{n+1}(b) - \epsilon_n(b) = \frac{h^2}{8\pi^2\mu}\,\frac{l_n(b)\,\{y_x[b', \epsilon_n(b')]\}^2}{\int_a^{b'} y[x, \epsilon_n(b')]^2\,dx} \tag{30-12}$$

Here $l_n(b)$ is the distance from b to the preceding node of $y[x, \epsilon_n(b)]$,
$y_x[b', \epsilon_n(b')]$ is the derivative with respect to x of $y[x, \epsilon_n(b')]$ at x = b',
and b' is a suitably chosen real number in the half wave-length interval

$$b - l_n(b) < b' < b.$$

It is further proved in Appendix H that any integral curve of (30-10)
for a positive value of $\epsilon$, say $y(x,E)$, can be fitted to a suitably chosen
Brillouin-Wentzel-Kramers approximation function [cf. Sec. 21a, Eq.
(21-7)]

\[ u(x,E) = A\,p^{-1/2} \cos\left[\frac{2\pi}{h}\int_{x_0}^{x} p(x,E)\,dx + \gamma\right], \qquad p^2 = 2\mu(E - V), \tag{30-13} \]


so that, for $x - x_0 > 0$,

\y{x,E) - u(x,E)\ < ^ 


where $M$ is a positive number which becomes infinite when $E$ approaches
zero. 

Let us now allow $n$ and $b$ to increase together from any pair of initial
values $n_0$, $b_0$ to infinity in such a manner that $\epsilon_n(b) = E$, independent
of $n$.

Then¹

\[ \lim_{n,b \to \infty} l_n(b) = \frac{h}{2\sqrt{2\mu E}}. \tag{30-14} \]


It follows from (30-13) that, to a close approximation for large values of $b$,

\[ \{y_x[b', \epsilon_n(b')]\}^2 = \frac{4\pi^2 A^2}{h^2}\sqrt{2\mu[\epsilon_n(b') - V(b')]}. \tag{30-15} \]

We conclude that the numerator of the right-hand member of Eq. (30-12)
remains finite as $n$ and $b$ become infinite. The denominator becomes
infinite, however, so that

\[ \lim_{n,b \to \infty} \Delta\epsilon(n,b) = 0. \tag{30-16} \]

¹ This can be proved either by the approximation formula (30-13) or by Sturm's
fundamental oscillation theorem (cf. Ince, Ordinary Differential Equations, p. 224).



Sec. 30] THE CONTINUOUS SPECTRUM IN ONE DIMENSION 


169 


Hence the density of the energy levels of problem $\beta$ in the neighborhood
of any positive energy E becomes infinite with b, and the complete spectrum 
of the problem must go over into the sum of a discrete spectrum of 
negative energies and a continuous spectrum of positive energies. 

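It follows from the definition of $b'$ that

\[ \epsilon_{n-1}[b - l_n(b)] < \epsilon_n[b'] < \epsilon_n[b - l_n(b)]. \]

In view of (30-16) we infer that $\epsilon_n(b')$ approaches $\epsilon_n(b)$ as $b$ and $n$ become
infinite. Consequently (30-12), (30-14), and (30-15) yield

\[ \lim_{n,b \to \infty} \Delta\epsilon(n,b) \int_a^b y[x, \epsilon_n(b)]^2\,dx = \frac{2\pi^2 A^2}{\kappa h} = \frac{hA^2}{4\mu}. \tag{30-17} \]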

In view of (30-17) it will simplify matters if we agree to normalize the
family of type B eigenfunctions $y(x,E)$ by the requirement that for all
positive energies the coefficient $A$ of (30-13) shall have the value $2\sqrt{\mu/h}$.
The right-hand member of (30-17) then reduces to unity, and we can
replace $\int_a^b y[x, \epsilon_n(b)]^2\,dx$ by $(\Delta\epsilon(n,b))^{-1}$ with an error which approaches
zero when $b$ and $n$ become infinite together in such a way that $\epsilon_n(b)$
remains constant.
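The limit (30-17) can be checked by hand in the simplest case. The following is a minimal sketch, assuming $V = 0$ on $0 < x < b$ with $y(0) = y(b) = 0$, so that the levels and the normalization integral are known in closed form; the function name and the unit values of $\mu$ and $h$ are illustrative choices, not part of the text.

```python
import math

def spacing_times_norm(n, b, mu=1.0, h=1.0):
    """Return delta_eps(n, b) * integral_0^b y^2 dx for V = 0 on 0 < x < b."""
    A = 2.0 * math.sqrt(mu / h)                          # value of A adopted after (30-17)
    eps = lambda k: (k * h) ** 2 / (8.0 * mu * b * b)    # levels with y(0) = y(b) = 0
    p = math.sqrt(2.0 * mu * eps(n))                     # momentum at the n-th level
    norm = A * A * b / (2.0 * p)                         # integral of A^2 p^-1 sin^2 over 0..b
    return (eps(n + 1) - eps(n)) * norm

# Let n and b grow together so that eps_n(b) stays fixed (n / b constant).
for n in (1, 10, 100, 1000):
    print(n, spacing_times_norm(n, b=float(n)))
```

In this special case the product equals $(2n+1)/2n$ exactly, so it tends to unity as $n$ and $b$ increase together, in agreement with the normalization rule.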

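*30e. The Eigendifferentials. — Consider next the sum $S_r(x,b)$
defined by the equation

\[ S_r(x,b) = \sum_{E_r < \epsilon_n(b) < E_r + \eta_r} w_{nb}(x)\,\Delta\epsilon(n,b). \tag{30-18} \]

For any fixed values of $x$, $E_r$, $\eta_r$ we obtain an eigendifferential from the
above expression by allowing $b$ to become infinite. Thus

\[ \Delta_r y \equiv \int_{E_r}^{E_r + \eta_r} y(x,E')\,dE' = \lim_{b \to \infty} S_r(x,b). \tag{30-19} \]

Moreover, due to the orthogonality of the functions $w_{nb}(x)$,

\[ \int_a^b [S_r(x,b)]^2\,dx = \sum_{E_r < \epsilon_n(b) < E_r + \eta_r} \int_a^b [w_{nb}(x)]^2\,dx\,[\Delta\epsilon(n,b)]^2. \]

Substituting $[\Delta\epsilon(n,b)]^{-1}$ for $\int_a^b [w_{nb}(x)]^2\,dx$ and taking the limit as $b$ becomes
infinite, we obtain

\[ \lim_{b \to \infty} \int_a^b [S_r(x,b)]^2\,dx = \lim_{b \to \infty} \sum_{E_r < \epsilon_n(b) < E_r + \eta_r} \Delta\epsilon(n,b) = \eta_r. \]

Finally, by interchanging the order of two limiting processes, we obtain

\[ \|\Delta_r y\|^2 = (\Delta_r y, \Delta_r y) = \eta_r. \tag{30-20} \]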





This formula indicates¹ the quadratic integrability of $\Delta_r y$ and is equivalent
to the normalization condition (30-5). As we have derived it on the
assumption that the amplitude coefficient at infinity, $A$, has the value
$2\sqrt{\mu/h}$, we see that (30-20) must be equivalent to giving $A$ the value
in question.

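The orthogonality of the eigendifferentials $\Delta_r y$, $\Delta_{r'} y$ for two non-overlapping
intervals $E_r < E < E_r + \eta_r$, $E_{r'} < E < E_{r'} + \eta_{r'}$ can be deduced as
follows. If we substitute for $\Delta_r y$ in (30-20) the sum $\Delta_r y + \Delta_{r'} y$, we must
evidently set the right-hand member equal to $\eta_r + \eta_{r'}$. Hence

\[ (\Delta_r y, \Delta_r y) + (\Delta_{r'} y, \Delta_{r'} y) + (\Delta_r y, \Delta_{r'} y) + (\Delta_{r'} y, \Delta_r y) = \eta_r + \eta_{r'}. \]

Then applying (30-20) to the individual eigendifferentials $\Delta_r y$, $\Delta_{r'} y$, and
remembering that $\Delta_r y$ and $\Delta_{r'} y$ are real, we obtain

\[ (\Delta_r y, \Delta_{r'} y) = 0. \]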

The derivative of $y(x,E)$, $(E > 0)$, with respect to $x$ behaves in the
same way at infinity as $y(x,E)$ itself. This follows from the asymptotic
B. W. K. approximation. Hence $\frac{d}{dx}\Delta_r y$ is quadratically integrable with
respect to $x$ and must approach zero at infinity. We conclude that $\Delta_r y$
satisfies the s.p.b.c. at both ends of the fundamental interval. It is in
fact a type A function and belongs to a class of functions with respect to
which the operator $-\frac{1}{\kappa}\frac{d^2}{dx^2} + V(x)\times$ is Hermitian (cf. Sec. 32d,
p. 202).

Let us now consider the function

\[ S(x) = \int_{E'}^{E''} \varphi(E)\,y(x,E)\,dE, \tag{30-21} \]

where $\varphi$ is a piece-by-piece continuous function of $E$, and both $E'$ and $E''$
are positive. We can approximate this function for any fixed $x$ by a sum
of products of eigendifferentials $\Delta_r y$ each multiplied by the corresponding
$\varphi(E_r)$. Forming the scalar product of two such approximations and
taking the limit as the range of each eigendifferential becomes zero,
we obtain

\[ (S, S) = \lim \sum_{E' < E_r < E''} |\varphi(E_r)|^2\,\eta_r = \int_{E'}^{E''} |\varphi(E)|^2\,dE. \tag{30-22} \]

This is the equivalent of Plancherel's theorem (9-6) for functions of the
form of $S(x)$.

¹ Of course our procedure has not been rigorous, owing to the interchange of the
order of two limiting processes. A rigorous proof of the quadratic integrability of
$\Delta_r y$ can be derived from the B. W. K. approximation (30-13).





*30f. Passage from the Completeness Theorem for Problem $\beta$ to
That for Problem $\alpha$. — If the functions $w_{nb}(x)$ are not normalized according
to the usual rule for discrete eigenfunctions the completeness theorem
for problem $\beta$ and quadratically integrable functions $f(x)$ takes the
form

\[ \int_a^b |f|^2\,dx = \sum_n |c_{nb}|^2 \int_a^b [w_{nb}(x)]^2\,dx, \tag{30-23} \]

where $c_{nb}$ is the Fourier coefficient

\[ c_{nb} = \frac{\int_a^b f(x)\,w_{nb}(x)\,dx}{\int_a^b [w_{nb}(x)]^2\,dx}. \]

Let us assume that the functions $w_{nb}(x)$, whose energies are negative,
are normalized in the usual manner for discrete eigenfunctions, while
those for positive energies are normalized according to (30-11). We
insert the value $\int_a^b f(x)\,w_{nb}(x)\,dx \left(\int_a^b [w_{nb}(x)]^2\,dx\right)^{-1}$ for $c_{nb}$ when $\epsilon_n(b) > 0$.
Then (30-23) becomes

\[ \int_a^b |f|^2\,dx = \sum_{\epsilon_n(b) < 0} |c_{nb}|^2 + \sum_{\epsilon_n(b) \ge 0} \frac{\left|\int_a^b f(x)\,w_{nb}(x)\,dx\right|^2}{\int_a^b [w_{nb}(x)]^2\,dx}. \tag{30-24} \]


Let us now assume that $f(x)$ is normalized to unity in the interval

\[ a < x < +\infty \]

and that it is absolutely, as well as quadratically, integrable in this
interval. The latter condition together with the existence of an upper
bound for the type B eigenfunctions $y(x,E)$ permits us to define the
expansion coefficient $c(E)$ by the formula

\[ c(E) \equiv \int_a^\infty f(x)\,y(x,E)\,dx \tag{30-25} \]

in the assurance that the integral converges, since

\[ \left|\int_a^\infty f(x)\,y(x,E)\,dx\right| \le \int_a^\infty |f|\,|y(x,E)|\,dx. \]

In laying down the definition (30-25) we deliberately omit the factor
$\rho(x) = \kappa$ of (30-7) in accordance with the suggestion of footnote 1,
p. 164.

In view of (30-17) and the normalization of $A$ to the value $2\sqrt{\mu/h}$,
we can replace $\int_a^b [w_{nb}(x)]^2\,dx$ by $\Delta\epsilon(n,b)^{-1}$ with an error which approaches
zero as $b$ becomes infinite. This yields






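\[ \sum_{\epsilon_n(b) < 0} |c_{nb}|^2 + \sum_{\epsilon_n(b) \ge 0} \left|\int_a^b f(x)\,y[x, \epsilon_n(b)]\,dx\right|^2 \Delta\epsilon(n,b) = 1. \tag{30-26} \]

In view of (30-25) the above equation goes over (formally) into

\[ \sum_{\text{all } n} |c_n|^2 + \int_0^\infty |c(E)|^2\,dE = 1. \tag{30-27} \]

As a corollary on (30-27) and (30-22) we infer that

\[ \Big\| f - \sum_{\text{all } n} c_n y_n(x) - \int_0^\infty c(E)\,y(x,E)\,dE \Big\|^2 = 0. \tag{30-28} \]

Finally, if $g(x)$ is a second function of $x$ quadratically integrable in
$a < x < \infty$, and if $b_n$, $b(E)$ are its Fourier coefficients,

\[ (f, g) = \sum_{\text{all } n} c_n b_n^* + \int_0^\infty c(E)\,b^*(E)\,dE. \tag{30-29} \]

In view of these relations we say that the totality of the type A and type B
eigenfunctions of problem $\alpha$ forms a complete system.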

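An exact type A solution of (30-10) reduces $\|(H - E)y\|$ to zero if we
identify $H$ with $-\frac{1}{\kappa}\frac{d^2}{dx^2} + V$. A function can be
said to form a good approximate solution of (30-10) if it makes
$((H - E)y, (H - E)y)$ small compared with $(y, y)$
for some value of $E$. From this point of view it is easy to use (30-29) to
prove that the general eigendifferential $\Delta y = \int_E^{E+\eta} y(x,E')\,dE'$ is actually a
very good approximate type A solution of (30-10) when $\eta$ is small. Thus

\[ ((H - \epsilon)\Delta y, (H - \epsilon)\Delta y) = (H\Delta y, H\Delta y) - 2\epsilon(H\Delta y, \Delta y) + \epsilon^2(\Delta y, \Delta y) = \eta\left[E^2 + E\eta + \frac{\eta^2}{3} - 2\epsilon\left(E + \frac{\eta}{2}\right) + \epsilon^2\right]. \]

When $\epsilon$ is given the value $E + \eta/2$ the above reduces to

\[ \frac{((H - \epsilon)\Delta y, (H - \epsilon)\Delta y)}{(\Delta y, \Delta y)} = \frac{\eta^2}{12}. \]

Hence we can make $\Delta y$ as good an approximate type A eigenfunction as
we please by a proper choice of $\eta$ in spite of the fact that $\lim_{\eta \to 0} \Delta y/\eta$ is not
itself a type A function.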
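The value $\eta^2/12$ can be confirmed numerically. The sketch below assumes, as (30-29) and (30-20) suggest, that the scalar products reduce to integrals over the energy interval, so that the numerator is $\int_E^{E+\eta}(E' - \epsilon)^2\,dE'$ and the denominator is $\eta$; the function name and the sample values of $E$ and $\eta$ are illustrative only.

```python
def mean_square_deviation(E, eta, eps, steps=20000):
    """((H - eps) dy, (H - eps) dy) / (dy, dy) for the interval E < E' < E + eta."""
    dE = eta / steps
    # midpoint rule for the integral of (E' - eps)^2 over the interval
    total = sum((E + (i + 0.5) * dE - eps) ** 2 for i in range(steps)) * dE
    return total / eta   # (dy, dy) = eta by (30-20)

E, eta = 2.0, 0.1
print(mean_square_deviation(E, eta, E + eta / 2), eta ** 2 / 12)
```

Any other choice of $\epsilon$ inside the interval gives a larger ratio, which is why the midpoint $E + \eta/2$ is the natural energy to assign to the eigendifferential.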

Equation (30*4) can be set up formally for our special case by a 
procedure similar to that adopted for the completeness formula (30*29). 




It is, of course, to be expected that (30-27), (30-28), and (30-29) will
hold for all quadratically integrable functions $f$, $g$, whereas (30-4) cannot
be assumed valid outside the restricted class of functions $f(x)$ admitted
by the Weyl theory.

30g. The Fourier Integral Formulas a Special Case. — As a first simple
illustration of the theory of continuous spectra let us consider the equation
of the linear oscillator (30-10) in the interval $0 < x < \infty$ with $V(x)$
set equal to zero and the boundary condition $y(0) = 0$. This may be
interpreted as the problem of a free particle in one dimension with an
abrupt infinite potential barrier at the origin. The boundary condition
is really a special case of the s.p.b.c. where the end point is not singular.
The eigenfunctions are multiples of $\sin[(\kappa E)^{1/2}x]$ and, when normalized
in accordance with the rule following Eq. (30-17) [or by Eq. (30-20)], take
the definite form

\[ y(x,E) = 2\mu^{1/2}(2\mu h^2 E)^{-1/4} \sin[(\kappa E)^{1/2}x]. \tag{30-30} \]

We take $f(x)$ to be an arbitrary continuous absolutely and quadratically
integrable function defined in the interval $0 < x < \infty$. The expansion
(30-4) then becomes

\[ f(x) = 2\mu^{1/2}\int_0^\infty c(E)\,(2\mu h^2 E)^{-1/4} \sin[(\kappa E)^{1/2}x]\,dE, \]

where $c(E)$, in accordance with (30-25), has the value

\[ c(E) = 2\mu^{1/2}(2\mu h^2 E)^{-1/4}\int_0^\infty f(\xi) \sin[(\kappa E)^{1/2}\xi]\,d\xi. \]

Replacing $\kappa E$ by $4\pi^2\sigma^2$, we obtain the symmetric Fourier integral formulas¹

\[ f(x) = 2\int_0^\infty G(\sigma) \sin 2\pi x\sigma\,d\sigma, \qquad G(\sigma) = 2\int_0^\infty f(x) \sin 2\pi x\sigma\,dx. \tag{30-31} \]

In case we abandon the boundary condition at the origin and suppose
our free particle to range from $-\infty$ to $+\infty$, it is evident that we shall
have to use a generalization of the theory of continuous spectra given
above. The phase of the eigenfunctions is now completely indeterminate
and $f(x)$ must be expanded in terms of the two linearly independent
eigenfunctions $\sin[(\kappa E)^{1/2}x]$ and $\cos[(\kappa E)^{1/2}x]$. In this case the
eigenfunctions of the continuous spectrum are degenerate. Using appropriate
normalization and determining the expansion coefficients in the
same way we obtain the Fourier integral formulas of Sec. 9.

¹ These formulas hold if $f(-x) = -f(x)$, or, if $f(0) = 0$ and $f(x)$ is undefined for
negative values of $x$. They are readily derived from the form of the Fourier integral
theorem used in Sec. 9.
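The symmetric pair (30-31) lends itself to a quick numerical check. The following is a minimal sketch, assuming the test function $f(x) = x e^{-\pi x^2}$ (an illustrative choice, not taken from the text), whose sine transform is known in closed form to be $G(\sigma) = \sigma e^{-\pi\sigma^2}$; the helper name, the cutoffs, and the step counts are likewise assumptions.

```python
import math

def sine_transform(func, s, upper=8.0, steps=4000):
    """Approximate 2 * integral_0^upper func(x) sin(2 pi x s) dx by the midpoint rule."""
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += func(x) * math.sin(2.0 * math.pi * x * s)
    return 2.0 * total * dx

f = lambda x: x * math.exp(-math.pi * x * x)   # odd test function chosen for illustration
G = lambda s: sine_transform(f, s)             # second formula of (30-31)
x0 = 0.6
f_back = sine_transform(G, x0, upper=4.0, steps=400)   # first formula of (30-31)
print(f(x0), f_back)
```

Applying the same transform twice returns the original function, which is the content of the two formulas taken together.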




30h. Normal Packet Functions in One Dimension. — We have defined
normal packet functions (Sec. 10) as quadratically integrable non-monochromatic
solutions of Schrödinger's second wave equation (5-10).
Very general solutions of this kind are obtained by taking linear combinations
of eigenfunctions of the first Schrödinger equation, each multiplied
by an appropriate time factor. Thus, if $H$ is the one-dimensional
Hamiltonian operator $V(x) - \frac{1}{\kappa}\frac{\partial^2}{\partial x^2}$ for (30-10), and if the series-integral

\[ \Psi(x,t) = \sum_n c_n \psi_n(x)\,e^{-\frac{2\pi i}{h}E_n t} + \int_0^\infty c(E)\,\psi(x,E)\,e^{-\frac{2\pi i}{h}Et}\,dE \tag{30-32} \]

is uniformly convergent, term-by-term application of the operator
$H + \frac{h}{2\pi i}\frac{\partial}{\partial t}$ suffices to prove that it represents a solution of the second
Schrödinger equation. If the function $\Psi(x,t)$ defined by the series
is quadratically integrable, it is a normal packet function. In Sec. 5
it was proved that every complete wave function $\Psi(x,t)$ is determined
by its differential equation and its form at some arbitrary time, say $t = 0$.
It follows that the most general normal packet function for (30-10)
is obtainable by fitting the initial condition $\Psi(x,0) = f(x)$, where $f(x)$
is given the most general form for a type A function.

But we know from the Weyl theory that if $\varphi(x)$ is any function which
is quadratically integrable, together with its transform $H\varphi$, over the
interval $a < x < \infty$, it is expansible into the uniformly convergent
discrete-continuous linear combination of eigenfunctions of $H$,

\[ \varphi(x) = \sum_n d_n \psi_n(x) + \int_0^\infty d(E)\,\psi(x,E)\,dE. \tag{30-33} \]

We have only to identify the expansion coefficients in (30-32) with those in
(30-33) in order to get a normal packet function which reduces to $\varphi(x)$
at $t = 0$.

It is not immediately evident that every type A function $f(x)$ necessarily
has a quadratically integrable transform $Hf$, but in Sec. 32 this
requirement will be added as a new restriction on the class of physically
admissible functions. From this point of view we can then say that
every physically admissible solution of the one-dimensional equation
$H\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t}$ satisfies the conditions imposed on $\varphi(x)$ and is expressible
in the form (30-32).
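The term-by-term argument behind (30-32) can be illustrated on a finite superposition. The sketch below assumes units in which $h/2\pi = 1$ and $\kappa = 1$, takes $V = 0$ with a wall at the origin so that the eigenfunctions are $\sin kx$ with $E = k^2$, and uses three hypothetical coefficients; it then estimates $(H + \frac{h}{2\pi i}\frac{\partial}{\partial t})\Psi$ by finite differences. A finite discrete sum stands in here for the series-integral of the text.

```python
import cmath
import math

# Units chosen for illustration: h / (2 pi) = 1 and kappa = 1, so H = -d^2/dx^2 and E = k^2.
COEFFS = {1.0: 0.8, 2.0: 0.5 + 0.2j, 3.0: -0.3j}   # hypothetical wave numbers k -> coefficients

def psi(x, t):
    """Finite superposition of sin(kx) eigenfunctions, each with its factor exp(-i E t)."""
    return sum(c * math.sin(k * x) * cmath.exp(-1j * k * k * t) for k, c in COEFFS.items())

def residual(x, t, dx=1e-3, dt=1e-4):
    """Finite-difference estimate of (H + (1/i) d/dt) psi, which should vanish."""
    h_psi = -(psi(x + dx, t) - 2.0 * psi(x, t) + psi(x - dx, t)) / dx ** 2
    dpsi_dt = (psi(x, t + dt) - psi(x, t - dt)) / (2.0 * dt)
    return h_psi + dpsi_dt / 1j

print(abs(residual(0.7, 0.3)))
```

The residual is limited only by the finite-difference step sizes, since each term separately satisfies the second Schrödinger equation.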

30i. Normal Packet Functions for the Two-Particle Problem. — As a
further illustration of the theory of continuous spectra let us return to
the two-particle problem of Sec. 28. The equation for the radial
eigenfunctions $\mathcal{R}$, viz., (28-19), is of the same type as Eq. (30-10). Hence




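an arbitrary function of the radius which conforms to the Weyl conditions
(p. 164) is capable of expansion in the form

\[ u(r) = \sum_{n=l+1}^{\infty} c_n \mathcal{R}_{nl}(r) + \int_{V(\infty)}^{\infty} c(E)\,\mathcal{R}_l(r,E)\,dE, \tag{30-34} \]

where $\mathcal{R}_{nl}$ and $\mathcal{R}_l(r,E)$ are normalized type A and type B eigenfunctions
of Eq. (28-19), respectively. It follows that an essentially arbitrary
normal packet function for the two-particle problem can be expressed in
the form

\[ \Psi = \sum_{m=-\infty}^{+\infty} \sum_{l=|m|}^{\infty} \Theta_{lm}\Phi_m \left[ \sum_{n=l+1}^{\infty} c_{nlm}\,\mathcal{R}_{nl}\,e^{-\frac{2\pi i}{h}E_{nl}t} + \int_{V(\infty)}^{\infty} c_{lm}(E)\,\mathcal{R}_l(r,E)\,e^{-\frac{2\pi i}{h}Et}\,dE \right]. \tag{30-35} \]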


To prove the above proposition we observe first of all that the above
expression is a solution of the appropriate second Schrödinger wave
equation

\[ -\frac{h^2}{8\pi^2\mu}\nabla^2\Psi + V\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t}. \tag{30-36} \]


In the second place it can be fitted to arbitrary physically admissible
initial conditions. Let

\[ F(x,y,z) \equiv f(r,\theta,\varphi), \]

for example, denote an arbitrary type A function of the Cartesian coordinates
with a quadratically integrable transform $HF$. Following an
argument which parallels that used at the close of Sec. 29 we expand
$f(r,\theta,\varphi)$ in the uniformly convergent series of spherical harmonics

\[ f(r,\theta,\varphi) = \sum_{m=-\infty}^{+\infty} \sum_{l=|m|}^{\infty} G_{lm}(r)\,Y_{lm}(\theta,\varphi). \tag{30-37} \]


The coefficients $G_{lm}$ are undetermined continuous functions of $r$.

We can also expand $Hf$ into a series of spherical harmonics

\[ Hf = \sum_{m=-\infty}^{+\infty} \sum_{l=|m|}^{\infty} F_{lm}(r)\,Y_{lm}(\theta,\varphi). \tag{30-38} \]

Let us assume for the moment our right to apply the operator $H$ term by
term to the expansion (30-37). Introducing the definitions


Ar 


-All - §^F(r) - 




(30-39) 



we readily derive 

Hf=% XHGi^Y,^ = X (30-40) 

ml ml 

Hence 

Fimir) = 

In Appendix I it is proved that the term-by-term application of the 
Hamiltonian operator H is valid and that the integrals 

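are convergent. Thus $\mathcal{G}_{lm}(r)$ conforms to the conditions imposed on the
function $f$ of Eq. (30-4) by the Weyl theory and can be expanded into
a series-integral combination of normal eigenfunctions and type B
eigenfunctions for Eq. (28-19). We write

\[ \mathcal{G}_{lm} = \sum_{n=l+1}^{\infty} c_{nlm}\,\mathcal{R}_{nl} + \int_{V(\infty)}^{\infty} c_{lm}(E)\,\mathcal{R}_l(r,E)\,dE, \tag{30-41} \]

which yields

\[ f(r,\theta,\varphi) = \sum_{m=-\infty}^{+\infty} \sum_{l=|m|}^{\infty} Y_{lm} \left[ \sum_{n=l+1}^{\infty} c_{nlm}\,\mathcal{R}_{nl} + \int_{V(\infty)}^{\infty} c_{lm}(E)\,\mathcal{R}_l(r,E)\,dE \right]. \tag{30-42} \]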

Comparing Eqs. (30-35) with (30-42), we see that $\Psi(r,\theta,\varphi,t)$ can be fitted
to the arbitrary physically admissible initial function $f(r,\theta,\varphi)$ with
quadratically integrable transform $Hf$ by a proper choice of the coefficients
$c_{nlm}$, $c_{lm}(E)$. This proves the theorem.

If the functions $\Phi_m(\varphi)$, $\Theta_{lm}(\theta)$, $\mathcal{R}_{nl}(r)$ and $\mathcal{R}_l(r,E)$ are normalized, the
formulas for the coefficients are


Cnim - = J* J* J'^'i'RniQim^^*e * ^“V-* sin d drdddip, 

Cni{E) = lim vv— ^ ^ (4^,A^(m[^]), 

r^iE^e ^ dE'\ 


(30-43) 

(30-44) 


The completeness of the system of eigenfunctions $\psi_{nlm}$, $\psi_{lm}(E)$ in the
sense of Eq. (30-8) is a corollary on the expansion theorem.

*30j. Normalization of the Class B Radial Eigenfunctions for the
Hydrogenic Atom. — In the special case of the hydrogenic-atom problem,
where $V(r)$ represents a Coulomb potential, it is possible to give explicit
formulas for the radial eigenfunctions of the continuous spectrum in




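terms of complex integrals.¹ Similar formulas are available for the
discrete eigenfunctions but will not be used here. The substitution

\[ \mathcal{R}_l(\rho,\epsilon) = \rho^{l} S_l(\rho,\epsilon) \tag{30-45} \]

in Eq. (29-2) yields the Laplace-type differential equation

\[ \frac{d^2 S_l}{d\rho^2} + \frac{2(l+1)}{\rho}\frac{dS_l}{d\rho} + \left(\epsilon + \frac{2}{\rho}\right)S_l = 0, \tag{30-46} \]

which can be reduced to the first order by Laplace's transformation. It
has positive energy solutions of the form²

\[ S_l(\rho,\epsilon) = a_r \int_{-i\sqrt{\epsilon}}^{+i\sqrt{\epsilon}} e^{\rho z}\,(z - i\sqrt{\epsilon})^{l - i/\sqrt{\epsilon}}\,(z + i\sqrt{\epsilon})^{l + i/\sqrt{\epsilon}}\,dz, \tag{30-47} \]

where $a_r$ is a normalization factor whose value is to be determined.
Useful expansions for small and large values of $\rho$ can be derived from
this integral. For large positive values of $\rho$ Fues (see footnote) gives
the asymptotic expression

\[ S_l(\rho,\epsilon) \approx -2ia_r\,\rho^{-(l+1)}\,[T \cos\delta(\rho,\epsilon) + S \sin\delta(\rho,\epsilon)], \tag{30-48} \]

where

\[ \delta(\rho,\epsilon) = \rho\sqrt{\epsilon} + \frac{1}{\sqrt{\epsilon}}\log\rho; \]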

^ / + -^ («+2) 

S + iT =. (2v^) e T 


Elementary manipulation yields 




(30-49) 


6li(p,c) ^ —2iar(2\/e)^e 


2v^rn 


cos [5(p,€) + 


The value of the phase constant $\theta(l,\epsilon)$ is not worked out by Fues, but
may be computed from formulas which he gives if desired.

According to the normalization scheme of (30-20) we should convert
$\mathcal{R}_l(\rho,\epsilon)$ into an explicit function of $r$ and $E$, and then choose $a_r$ to meet the
condition³



dE^'dr 



¹ Cf. E. Schrödinger, Ann. d. Physik 79, 361 (1926).

² Cf. L. Schlesinger, Einführung in die Theorie der gewöhnlichen Differentialgleichungen,
3d ed., Kap. 8, Berlin, 1922; E. Fues, Ann. d. Physik 80, 367 (1926), 81,
281 (1926).

³ Of course we are at liberty to simplify our formulas by revising the method of
normalization, using $\rho$ as the variable of integration in forming scalar products and
substituting $\epsilon$ for $E$ in (30-20) and in all formulas depending on the normalization
scheme. The details can be left to the reader.





This is equivalent to writing $\mathcal{R}_l(\rho,\epsilon)$ in the form (30-13) with $A$ set equal
to $2\sqrt{\mu/h}$. In other words we must choose $a_r$ so that, for large values
of $r$,


^ (rZ a,\E\ _ 2 \.2_lil +J)] 

®‘\ao’ ) i^e^Zy\ P p* J 

/ 

. •/Pi 


X cos 


u 




This means that 


ar = 


3ir 




{2Te^Zy^^ 2^\r{l + 1 - 


(30-50) 


To obtain the complete expression for the normalized eigenfunctions of
the continuous spectrum we combine Eqs. (30-45), (30-47), and (30-50),
replace $\rho$ by $rZ/a_0$ and multiply in the normalized expressions for $\Theta_{lm}$
and $\Phi_m$. Thus




y / 


H JZ. 


1 - 


1(21 4 - l) a - \m\)\(rZV 
2(f + ]mj)! \oo/ 


27rao(2V'*)'+^|r(f + 

+‘^ TjZ 

X 7*zim|(cos I e “"'(2 — t\/«) v^(z + t\/e) ^'dz. (30-51) 

The absolute value of the gamma function $\Gamma(l + 1 - i/\sqrt{\epsilon})$ can be
calculated by means of an infinite product given by Fues (loc. cit.,
p. 303).


31. WEAK QUANTIZATION. THEORY OF RADIOACTIVE EMISSION OF ALPHA PARTICLES

31a. Weak Quantization in General. — It will be of interest to consider
at this point the application of our theory of continuous spectra to
problems involving imperfect, or weak, quantization.

In the Bohr theory, it will be remembered, a quantized, or stationary,
state was one involving a multiply periodic classical motion and a discrete
energy level determined by the Sommerfeld quantum conditions. It
was early recognized by Bohr that where an approximate theory of an
atomic system gives such quantized states with sharply defined energy
levels, a more exact theory may show that the classical motions are only
approximately of the multiply periodic type and that in consequence



Sec. 31] 


WEAK QUANTIZATION 


179 


the quantum conditions are not really applicable after all. Bohr had 
no scheme for adapting quantum properties to motions of other types 
than the multiply periodic one. Hence he had always to begin work by 
replacing the complete classical theory of the system under consideration 
by an idealized one having multiply periodic motions. It was necessary, 
for example, to discard radiation forces and interatomic collisions in 
order to set up a model of the hydrogen atom to which the quantum 
conditions would be applicable. The deficiencies of his theory were 
frankly faced by Bohr and he foresaw at an early date the probability 
that in cases where the classical motion is only approximately of the 
multiply periodic type a more complete theory would replace the sharply 
defined energy levels of his own theoretical methods with a narrow range 
of preferred energy values.¹

Bohr's expectation has been fulfilled by the development of the
quantum-mechanical theory of states of imperfect quantization. Both
theoretically and experimentally we find states intermediate in character
between the sharply defined stationary states of the typical discrete
energy level and the communistic array of qualitatively similar states 
of a free electron. These intermediate states will hereafter be referred 
to as weakly quantized on account of their energy uncertainty. Con- 
trary to what one might suppose from the above introductory remarks, 
they do not occur whenever a classical analysis would suggest an aperiodic 
motion, nor are they always excluded in cases in which classical theory 
does give periodic, or multiply periodic, motions. They are described 
by wave packets built up from continuous-spectrum eigenfunctions which 
approximate the properties of discrete eigenfunctions. Interpreting 
states with discrete- and continuous-spectrum eigenfunctions as the 
analogues of the classical multiply periodic and aperiodic motions, the 
parallelism between the quantum-mechanical situation and that pre- 
visaged by the Bohr theory becomes apparent. 

31b. A Model for Alpha Particle Disintegration. — As an example we
shall consider here the imperfectly quantized nuclear energy levels of the
Gamow-Gurney-Condon theory of alpha-disintegration.² This theory
is built on a simplified model which replaces the actual nucleus by a 
single alpha particle moving in a central force field. The model is 
similar in character to those used in dealing with optical atomic spectra, 
diatomic molecular spectra, and collisions between electrons and atoms, 
as two-particle problems. In the case under consideration the field is 
assumed to be of the Coulomb type down to a radius of the order of 

¹ Cf. N. Bohr, Zeits. f. Physik 13, 117 (1923); W. Pauli, Jr., in Geiger and
Scheel's Handbuch der Physik, vol. XXIII, p. 68, Berlin, 1926.

² Cf. G. Gamow, Zeits. f. Physik 51, 204 (1928); R. W. Gurney and E. U. Condon,
Phys. Rev. 33, 127 (1929); M. Born, Zeits. f. Physik 58, 306 (1929); H. Casimir,
Physica 1, 193 (1934).





cm., beyond which it passes through a maximum and then falls off
rapidly to take on a minimum value at the origin. The potential hole
at the center is analogous to the crater of a volcano, and one would
actually get a volcano-shaped model if he plotted the potential energy as
ordinate over a two-dimensional section of the field about the nucleus.
The assumption of a Coulomb field outside the rim of the crater is well
grounded on scattering experiments and the theory of atomic spectra.
The crater itself is needed to provide the weakly quantized initial states
from which the radioactive disintegration proceeds. If we wish to use
the language of a less schematic theory we speak of a radioactive atom
as a system of protons and neutrons, or alpha particles, neutrons, and
protons in a weakly quantized state.

Fig. 13. — The volcano model for alpha-particle disintegration.

From a classical point of view this model yields no quasi-periodic motions.
Alpha particles inside the crater have periodic or multiply periodic motions
if their energy is less than $V_{max}$, while alpha particles outside the crater,
or having an energy greater than $V_{max}$, have purely aperiodic motions.
From the standpoint of quantum mechanics, however, the situation is quite
different. Owing to the tunnel effect discussed in Sec.
21 the wave function of an alpha particle initially inside the crater,
but having an energy greater than $V(\infty)$ and less than $V_{max}$, will
in time leak through the crater wall. It follows that, as time goes on,
the probability that the particle is inside the crater will decrease from
its unit initial value, while the probability that it is outside increases
correspondingly. Thus an assemblage of similar models described by such
a "leaking" wave function would be observed to disintegrate spontaneously
after the manner of real radioactive nuclei.

It is necessary to distinguish between those cases in which the crater
is deep and wide enough to permit the existence of one or more discrete
negative energy levels, and those in which the crater is too shallow for
negative energies. In the former case there will be a lowest discrete
energy level correlated with a nonradioactive normal state of the nuclear
model. Only excited states of the nucleus will disintegrate spontaneously,
for only excited states with positive energies will have a finite
potential barrier to tunnel under; the tunnel must be at constant energy,
i.e., horizontal on such a graph as Fig. 13. In the latter case, however,
any initial wave function representing a state in which the alpha particle
starts out inside the crater will necessarily have a positive range of





energies and will therefore show the radioactive property. The normal
state of a non-disintegrated nucleus is then defined as the least energetic
of the states of weak quantization which we are about to investigate.

31c. Resonant Energy Intervals. — There are two fundamental procedures
for the study of the problem in hand. One of these is to introduce
a pair of modified problems having the potential energies indicated by
the graphs of $V_1(r)$ and $V_2(r)$ in Fig. 13. Then the $V_1$ problem will
have a set of discrete eigenfunctions located almost entirely within the
original crater, while the $V_2$ problem will have a purely continuous
spectrum whose eigenfunctions hardly penetrate the crater at all. The
actual problem may be regarded as a combination of the two others with
coupling, and the process of radioactive disintegration interpreted as
consisting of transitions between almost orthogonal states of the same
energy. From this point of view the problem is amenable to perturbation
methods similar to those developed in Chap. XII. A great advantage
of the method is that it yields a common basis for attack on all
problems of weak quantization, including, for example, Oppenheimer's
first work on the spontaneous ionization of atoms in electric fields¹
and the theory of the broadening of energy levels by the radiation
process.²

The second procedure, used by Gamow and by Gurney and Condon,
is based on direct application of the B. W. K. approximation method.
This procedure lends itself to graphical presentation and is adopted here.

The differential equations to be solved in dealing with our model are
of the type already encountered in our initial study of the two-particle
problem in Sec. 28. States of any angular momentum which leaves an
adequate minimum in the effective potential-energy function are possible.
The radial equation can be written in the form

\[ H_l \mathcal{R} = E\mathcal{R}, \tag{31-1} \]

where $H_l$ is the effective radial Hamiltonian operator defined by

+ < 312 ) 

Here $\kappa$ is again the product of $8\pi^2/h^2$ into the reduced mass of the system,
but the latter is no longer approximately equal to the electronic mass.
We must give it the much larger value $\mu = m_\alpha(M - m_\alpha)/M$, where $m_\alpha$
is the mass of the alpha particle and $M$ is the total mass of the disintegrating
nucleus.³

¹ J. R. Oppenheimer, Phys. Rev. 31, 66 (1928). This seems to be the first application
of quantum mechanics to a problem of this type.

² Cf. V. Weisskopf and E. Wigner, Zeits. f. Physik 63, 54 (1930).

³ The reader should note at this point, if he has not already done so, that the use
of the nonrelativistic Equation (31-1) commits us to the approximation of neglecting
the variation in the mass of the alpha particle with velocity.





The boundary condition to be used at the origin is the requirement
that $\mathcal{R}(0) = 0$. As no solution of (31-1) for $E > 0$ is quadratically
integrable at infinity, it follows from Sec. 30 that all positive energies
belong to the continuous spectrum, and that every function $\mathcal{R}(r,E)$
which vanishes at the origin is a type B eigenfunction. For large values
of $r$ these functions reduce to sine waves of length $h/p(\infty,E) = h/(2\mu E)^{1/2}$.
The amplitude at infinity of the normalized real waves in which we are
chiefly interested is $2^{3/4}(\mu/h^2E)^{1/4}$ (cf. Sec. 30d). Thus it decreases regularly
with increasing energy. Inside the crater, however, the variation of
amplitude with energy is of a different character. In the latter region
there is a resonance phenomenon which is the fundamental cause of the
weak quantization.
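The asymptotic amplitude and wave length can be tabulated directly from the B. W. K. form $A p^{-1/2}$ of Sec. 30d. The following is a minimal sketch, assuming $V(\infty) = 0$, the normalization $A = 2\sqrt{\mu/h}$, and illustrative units $\mu = h = 1$; the function names are not from the text.

```python
import math

def amplitude_at_infinity(E, mu=1.0, h=1.0):
    """A * p**-0.5 with A = 2*sqrt(mu/h) and p = sqrt(2*mu*E), taking V(infinity) = 0."""
    A = 2.0 * math.sqrt(mu / h)
    p = math.sqrt(2.0 * mu * E)
    return A / math.sqrt(p)

def wavelength(E, mu=1.0, h=1.0):
    """Wave length h / p far outside the crater."""
    return h / math.sqrt(2.0 * mu * E)

for E in (0.5, 1.0, 2.0, 4.0):
    closed_form = 2.0 ** 0.75 * (1.0 / E) ** 0.25   # 2^(3/4) (mu / h^2 E)^(1/4) with mu = h = 1
    print(E, amplitude_at_infinity(E), closed_form, wavelength(E))
```

Both columns agree, and the amplitude falls off as $E^{-1/4}$, which is the regular decrease with energy referred to above.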

In order to fix our ideas let us assume that we have to do with a
special problem in which the azimuthal quantum number $l$ is zero. The
curves $y = E$ and $y = V(r)$ will then cross at two classical turning points
$r_1$ and $r_2$, shown in Fig. 13, and similar to the turning points $x_1$ and
$x_2$ of Fig. 7. (When $l$ does not vanish, the effective potential $V + l(l+1)/\kappa r^2$
has a pole of the second order at the origin and there are either three
classical turning points, or only one.) Let I, II, III denote respectively
the intervals $0 < r < r_1$, $r_1 < r < r_2$, $r_2 < r$. Tracing the course of
the integral curve representing a type B eigenfunction, we note that it
oscillates in I and III, but is convex to the $r$ axis in II, as if strongly
repelled by that axis. On account of the latter property $|\mathcal{R}(r,E)|$
becomes much larger near the turning point $r_2$ than in I, except for
energies in the neighborhood of those which yield monotonically decreasing
values of $|\mathcal{R}|$ throughout the region II. The special energies in
question are evidently approximately the same as the discrete eigenvalues
of the modified problem previously mentioned with $V$ replaced by $V_1$.
For these energies the amplitude in the crater is large compared with that
outside the crater, and hence large in an absolute sense if the amplitude
at infinity is fixed by the normalization condition. For energies much
removed from these resonance values the amplitude inside the crater is
necessarily small.

It is of interest to employ the B. W. K. method to locate the resonance
maxima and the widths of the peaks on the resonance curve. For this
purpose we assume that the first and second derivatives of $V$ vanish at
the origin. Then the discrepancy function $Q$ of (21-12) vanishes at the
origin. In the neighborhood of the origin

(R(r,E) - 2B(£)p(r,J5)-^ cos - |] 

. - 2B{E)p(r,E^-^ + | - (31-3) 




where $B(E)$ is a normalizing factor to be determined, and $J(E)$ denotes
the Sommerfeld phase integral $2\int_0^{r_1}|p(\xi,E)|\,d\xi$. In order to get a connection
between the intervals I and II we shall assume that the "mountain"
to be tunneled through is high enough above the energy $E$ so that the
B. W. K. approximations are good in the middle of the region II. We
now make use of the connection formulas (21-9) and (21-31). The
former must be used "against the arrow," which becomes possible if we
introduce a small unknown phase angle $\epsilon(E)$ and rewrite (21-9) in the
form

2p“^ — I + e j (314) 

I ; I I I 

I II 

Let Eq. (31-3) be rewritten in the form

(S\{f,E) = “ i ■*■ *) i 

+ ~ 5 ~ 5 '*’ *)]' 

Equations (31-4) and (21-31) now give for the good portion of II (i.e.,
the part of the interval in which $|Q/p^2| \ll 1$)

(K(r,£)ii = cos^tt^ — ^ 

+ sinf ”“4 + 7 ^ ^ * (31*6) 

From the above equation it is evident that if $\cos(\pi J/h - \pi/4 + \epsilon)$
is zero, $|\mathcal{R}(r,E)_{\mathrm{II}}|$ will decrease monotonically throughout the good portion
of II. We infer that the amplitude of the oscillations in I will be large
compared with those in II for some energy values not very different from
those given by the condition that $\pi J/h - \pi/4$ shall be an odd multiple
of $\pi/2$. In other words an approximate condition for resonance is that

\[ J(E) = (n + \tfrac{3}{4})h. \tag{31·7} \]

This is just the B. W. K. approximation condition for the location of the 
discrete eigenvalues of the modified problem with $V$ replaced by $V_1$.
The appearance of the “quarter-integral” quantum number is due to the 
special boundary condition at the origin and has nothing to do with 
the problem of weak quantization. 
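
As a numerical illustration of the resonance condition $J(E) = (n + \tfrac{3}{4})h$, the following minimal Python sketch locates the approximate resonance energies for an assumed model well. The potential $V_1(r) = r^4$ (chosen only because its first and second derivatives vanish at the origin), the units $h = \mu = 1$, and all function names are illustrative assumptions and are not taken from the text.

```python
import math

H = 1.0   # Planck's constant h (assumed unit)
MU = 1.0  # reduced mass (assumed unit)


def v1(r):
    """Assumed model potential V1(r) = r**4; V1' and V1'' vanish at r = 0."""
    return r ** 4


def turning_point(energy, r_max=50.0):
    """Bisect for r1 with V1(r1) = energy, assuming V1 increases on [0, r_max]."""
    lo, hi = 0.0, r_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if v1(mid) < energy:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phase_integral(energy, steps=2000):
    """J(E) = 2 * integral from 0 to r1 of sqrt(2 mu (E - V1)) dr.

    Midpoint rule after the substitution r = r1 sin(theta), which smooths
    the square-root behaviour of the integrand at the turning point.
    """
    r1 = turning_point(energy)
    d_theta = 0.5 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        r = r1 * math.sin(theta)
        p = math.sqrt(max(2.0 * MU * (energy - v1(r)), 0.0))
        total += p * r1 * math.cos(theta) * d_theta
    return 2.0 * total


def resonance_energy(n, e_hi=1.0e4):
    """Solve J(E) = (n + 3/4) h by bisection; J(E) increases with E."""
    target = (n + 0.75) * H
    lo, hi = 1.0e-9, e_hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if phase_integral(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


levels = [resonance_energy(n) for n in range(4)]
for n, e in enumerate(levels):
    print(f"n = {n}: E = {e:.6f}, J/h = {phase_integral(e) / H:.6f}")
```

For this particular model the phase integral has the closed form $J = 2\sqrt{2\mu}\,c\,E^{3/4}$ with $c = \int_0^1\sqrt{1 - x^4}\,dx$, so the numerical roots can be compared with an analytic expression; for a realistic crater potential only the numerical route would be available.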

In order to complete our task we have to bridge the gap between II
and III with the aid of connection formulas (21·8) and (21·30), or an
equivalent. As the latter formula would have to be used "against the
arrow," it is simpler to make direct application of the matrix connection
formulas (21·25).


Let $K$ denote the integral $\frac{2\pi}{h}\int_{r_1}^{r_2}|p|\,d\xi$, and let $\varphi$ stand
for the quantity $(\pi J/h - \pi/4 + \epsilon)$. Choosing that branch of the function
which makes the product real in II, we readily reduce the
expression for $\mathcal{R}(r,E)_{\mathrm{II}}$ to the appropriate standard form for the
connection formulas, viz.,


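\[ \mathcal{R}(r,E)_{\mathrm{II}} = B\,|p|^{-1/2}\Big[2\cos\varphi\;e^{K}\exp\Big(-\frac{2\pi}{h}\int_r^{r_2}|p|\,d\xi\Big) + \sin\varphi\;e^{-K}\exp\Big(\frac{2\pi}{h}\int_r^{r_2}|p|\,d\xi\Big)\Big]. \tag{31·8} \]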

This is of the form $a_u f_u + a_v f_v$ with $f_u$ and $f_v$ defined as in Sec. 21e and
the coefficients

\[ a_u = B\sin\varphi\;e^{-K}, \qquad a_v = 2B\cos\varphi\;e^{K}. \tag{31·9} \]

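The corresponding coefficients in III are accordingly determined by

\[ \beta_u = \beta_v^* = Be^{i\pi/4}\big[\tfrac{1}{2}\sin\varphi\;e^{-K} + i\,(X\sin\varphi\;e^{-K} - 2\cos\varphi\;e^{K})\big], \]

where $iX$ is the unknown imaginary part of $g_{uu}$ [cf. Eq. (21·29), however].
The radial wave function in III is consequently expressible as

\[ \mathcal{R}(r,E)_{\mathrm{III}} = 2|\beta_u|\,p^{-1/2}\cos\Big(\frac{2\pi}{h}\int_{r_2}^{r}p\,d\xi + \gamma\Big). \tag{31·10} \]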

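We need not concern ourselves with the phase angle $\gamma$. $|\beta_u|$ is to be
normalized to the standard value $2^{3/4}(\mu/h^2E)^{1/4}$. Hence

\[ B^{-2} = \Big(\frac{h^2E}{8\mu}\Big)^{1/2}\Big[\frac{\sin^2\varphi}{4}\,e^{-2K} + (X\sin\varphi\;e^{-K} - 2\cos\varphi\;e^{K})^2\Big]. \tag{31·11} \]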


In spite of the uncertainty in the value of $X$, this equation enables us
to determine the position and character of the resonance regions. $|X|$ is
of course less than $|g_{uu}|$ and, if there is a good path $\Gamma$ through the complex
$r$ plane connecting the good portion of II with the good portion of III, it
follows from the inequality (21·29) that $|g_{uu}|$ is very much less than $e^{2K}$.
Thus $|Xe^{-2K}| \ll 1$. Inspection of (31·11) shows that $B$ can be large only
when $\cos\varphi$, the cofactor of the very large quantity $e^{K}$ on the right, is
very nearly zero. Hence we can set $\sin\varphi$ equal to $+1$ without appreciable
error in the range of resonance values with which we are concerned.
Then

\[ B^{-2} = \Big(\frac{h^2E}{8\mu}\Big)^{1/2}\Big[\frac{e^{-2K}}{4} + (Xe^{-K} - 2\cos\varphi\;e^{K})^2\Big]. \tag{31·12} \]

Except in the immediate neighborhood of points for which $\cos\varphi = 0$,
the last term is by far the largest and its variation with $\varphi$ will evidently
be the controlling factor in determining the position and shape of the
resonance bands. To get the point of maximum resonance we have to
set the derivative of the right-hand member of (31·12) with respect to the
energy equal to zero.

In calculating this derivative we can assume that X is insensitive to 
small energy changes. In fact, it would be completely independent of 
the energy if the potential V(r) were exactly linear over the range of r 
values bridged by the connection formulas in passing from II to III. 


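Furthermore, the derivatives of $\varphi = \pi J/h - \pi/4 + \epsilon$ and $K$ are of the same
order of magnitude. Bearing these items in mind it is possible to neglect
a number of terms in the derivative and one readily verifies that, to a
close approximation,

\[ \frac{d(B^{-2})}{dE} = -\frac{4\pi}{h}\Big(\frac{h^2E}{8\mu}\Big)^{1/2}\frac{dJ}{dE}\,(2\cos\varphi\;e^{2K} - X). \]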


Setting this derivative equal to zero, we find that the condition for
maximum resonance is that

\[ \cos\varphi = \tfrac{1}{2}Xe^{-2K}. \tag{31·13} \]

As the right-hand member of this equation is small compared with unity
when a good path around in the complex plane exists, we may regard
this equation as an approximate confirmation of (31·7). It does not
seem possible to improve on (31·7) with the mathematical tools we are
using. The value of $B^{-2}$ for the resonance position $E = E_k$ is

\[ B_m^{-2} = \Big(\frac{h^2E}{8\mu}\Big)^{1/2}\frac{e^{-2K}}{4}. \tag{31·14} \]

In other words $B_m^2$ is $4e^{2K}$ times as great as the corresponding squared
amplitude at infinity.

We proceed to the computation of the half breadth $\Delta E_k$ of a resonance
peak, which we define as the absolute value of the energy change
required to reduce $B^2$ from its maximum value to half maximum value.
In making the calculation we shall treat $X$, $K$, and the factor $(h^2E/8\mu)^{1/2}$
on the right side of (31·12) as constants over the resonance interval.
Inserting for $B^{-2}$ in (31·12) twice the value given by (31·14), we obtain

\[ \frac{e^{-2K}}{4} = (Xe^{-K} - 2\cos\varphi\;e^{K})^2. \tag{31·15} \]

Let $\varphi_m$ and $\varphi_{1/2}$ denote the roots of (31·13) and (31·15), respectively, which
belong to the resonance region of $E_k$. Then these equations give

\[ \cos\varphi_{1/2} = \cos\varphi_m \pm \frac{e^{-2K}}{4}. \]

The desired half breadth is accordingly

\[ \Delta E_k = \frac{h}{4\pi}\,\frac{dE}{dJ}\,e^{-2K}. \]

The quantity $dE/dJ$ which appears in this equation is well known to be
the classical frequency $\omega(E)$ of a particle vibrating between the turning
points $r = 0$ and $r = r_1$ under the influence of the potential energy $V(r)$.
Hence our result can be given the final form

\[ \Delta E_k = h\,\omega(E_k)\,\frac{e^{-2K}}{4\pi}. \tag{31·16} \]

In practice the number $e^{-2K}$ is exceedingly small, although it
becomes equal to unity for an energy $E_k$ which touches the rim of the
crater. The variations in $\omega(E)$ can be neglected in comparison with those
of its cofactor in (31·16). Thus Eqs. (31·14) and (31·16) point to exceedingly
sharp resonance with high maximum amplitude ratios. It will
be observed that the upper resonance peaks are broader than the lower 
ones. 
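
The shape of the resonance peak can be checked numerically. The sketch below assumes the reading of (31·12) in which $B^{-2}$ is proportional to $e^{-2K}/4 + (Xe^{-K} - 2\cos\varphi\,e^{K})^2$, drops the constant factor, and uses arbitrary illustrative values of $K$ and $X$ (neither value is taken from the text). It confirms that the maximum of $B^2$ lies at $\cos\varphi = \tfrac{1}{2}Xe^{-2K}$ and that $B^2$ falls to half its maximum when $\cos\varphi$ is displaced by $\pm e^{-2K}/4$.

```python
import math


def inv_b_squared(cos_phi, k, x):
    """Relative B**-2 (constant factor dropped), with sin(phi) set to 1."""
    small = 0.25 * math.exp(-2.0 * k)
    large = (x * math.exp(-k) - 2.0 * cos_phi * math.exp(k)) ** 2
    return small + large


K = 3.0   # assumed barrier integral (illustrative value)
X = 0.1   # assumed value of the unknown quantity X (illustrative value)

cos_peak = 0.5 * X * math.exp(-2.0 * K)   # position of maximum resonance
offset = 0.25 * math.exp(-2.0 * K)        # displacement to half maximum
peak = inv_b_squared(cos_peak, K, X)      # minimum of B**-2

for cos_phi in (cos_peak - offset, cos_peak, cos_peak + offset):
    ratio = peak / inv_b_squared(cos_phi, K, X)   # B**2 relative to its maximum
    print(f"cos(phi) = {cos_phi:.6e}, B^2 / B^2_max = {ratio:.6f}")
```

Because the displacement in $\cos\varphi$ is proportional to $e^{-2K}$, raising the assumed value of `K` narrows the peak very rapidly, which is the sharpness referred to above.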

31d. Energy Distribution in Weakly Quantized States. — It will be
immediately evident that the existence of these resonance energies implies
the existence of corresponding approximately monochromatic, quadratically
integrable solutions of the Schrödinger equation
$H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$
having the property that $|\psi|^2$ is nearly independent of time. One can
set up the wave function of such a quasi-monochromatic quasi-stationary
state in various ways, one of which is to choose an initial wave form which
is quadratically integrable but nevertheless everywhere an approximate
solution of $H\psi = E\psi$ for a resonance energy $E = E_k$. It will suffice
to set the angular momentum equal to zero, in which case the wave
function reduces to a radial factor which we designate as $G(r,t)$ and the
operator $H_l$ to $H_0$ [cf. Eq. (31·1)]. In other words,
$H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$
reduces to $H_0G(r,t) = -\frac{h}{2\pi i}\frac{\partial G}{\partial t}$.

Let us now identify $G(r,0)$ with the $k$th discrete eigenfunction of
$H_0'u = Eu$, choosing for $H_0'$ the radial Hamiltonian of the modified
problem previously mentioned in which the potential function of the
model, viz., $V(r)$, has been replaced by a function $V_1(r)$ which is equal to
$V$ from the origin to the apex of the crater but continues to rise indefinitely
as $r$ increases beyond that point. Such an initial function $u_k(r)$
will be sensibly equal to zero for all values of $r$ for which $H_0'$ differs
appreciably from $H_0$. Hence it is an approximate solution of $H_0\mathcal{R} = E_k\mathcal{R}$
for all values of $r$ in the sense that $(H_0 - E_k)u_k$ is everywhere small
compared with the amplitude of $u_k$ inside the crater.





Our initial surmise that the probability density $|G|^2$ will be nearly
constant in time is supported by a computation of the derivatives of
$|G(r,t)|^2$ at the initial instant when $G$ reduces to $u_k$. For example,

\[ \Big(\frac{\partial|G|^2}{\partial t}\Big)_{t=0} = \frac{2\pi i}{h}\big[u_k(H_0 - E_k)u_k^* - u_k^*(H_0 - E_k)u_k\big] \cong 0. \tag{31·17} \]

Postponing a fuller discussion of the variation of the wave function
with time, we pause to consider the frequency, or energy, distribution
which follows from our choice of $G(r,0)$. To this end it is necessary to
express $G(r,t)$ in the form

\[ G(r,t) = \int a(E)\,\mathcal{R}(r,E)\,e^{-2\pi iEt/h}\,dE, \tag{31·18} \]

where $\mathcal{R}(r,E)$ is a normalized continuous-spectrum eigenfunction of
(31·1). By (30·25),

\[ a(E) = \int_0^{\infty}G(r,0)\,\mathcal{R}(r,E)\,dr = \int_0^{\infty}u_k\,\mathcal{R}(r,E)\,dr. \tag{31·19} \]

$a(E)$ will clearly be negligible except in that particular resonance region
to which $u_k$ belongs. For this narrow range of energies and for that
range of $r$ values for which $u_k$ is appreciably different from zero, we
can treat $\mathcal{R}(r,E)$ as a multiple of $u_k$. Furthermore, the ratio of $\mathcal{R}(r,E)$
to $u_k$ in the intervals under consideration must be proportional to $B(E)$.
Thus, to a close approximation,

\[ a(E) = gB(E), \tag{31·20} \]

where $g$ is a constant with which we are not concerned.

In accordance with optical analogy we may assume that $|a(E)|^2\,dE$
is a measure of the probability of the energy interval $dE$ for systems in a
state described by $G(r,t)$. This postulate will be more fully justified in
the discussions of Secs. 35 and 36. If the postulate be granted, we can
infer that the energy distribution in the weakly quantized state is given
by $B(E)^2$ to a constant factor of proportionality. Hence we can identify
the quantity $\Delta E_k$ with the approximate uncertainty of the energy
associated with this state.

31e. The Disintegration Process. — The final step in our discussion is
to show that, if we start with a weakly quantized state, the amplitude 
of the wave function inside the crater decays exponentially in time, while 
the integrated intensity beyond the barrier increases to compensate. 
When this has been proved we shall have verified the experimental decay 
law for alpha-particle disintegration. 





The problem can be attacked in a number of ways of which that
suggested by von Laue¹ is the simplest. Laue makes use of a quasi-classical
argument. From the prequantum point of view we should
expect from our model that the alpha particle would vibrate back and
forth between $r = 0$ and $r = r_1$ with a frequency $\omega(E_k)$ until the time
that it actually passes through the barrier. Our study of the transmission
of matter waves through potential barriers has shown that in
the case of a well-rounded potential hill the transmission coefficient
is $T = 1/(1 + e^{2K})$. We should accordingly expect that at each impact
the classical particle would have a chance $1/(1 + e^{2K})$ of escape. Consider
an assemblage of similar models of which $N(t)$ remain in the initial
undisintegrated state at the time $t$. The number which escape per
second will then be equal to $N(t)/(1 + e^{2K})$ multiplied by the number of
impacts per second $\omega(E_k)$. Thus we arrive at the rule

\[ \frac{dN(t)}{dt} = -\frac{N(t)\,\omega(E_k)}{1 + e^{2K}}. \]

Integrating, we obtain

\[ N(t) = N(0)e^{-\lambda t}; \qquad \lambda = \frac{\omega(E_k)}{1 + e^{2K}}. \tag{31·21} \]
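
The quasi-classical counting argument can be illustrated with a few lines of Python. The sketch below compares the exponential law with $\lambda = \omega/(1 + e^{2K})$ against the surviving fraction obtained by treating every impact on the barrier as an independent trial with escape chance $1/(1 + e^{2K})$. The values of $\omega$ and $K$ and all function names are illustrative assumptions, not data from the text.

```python
import math


def transmission(k):
    """Transmission coefficient 1 / (1 + exp(2K)) of a well-rounded hill."""
    return 1.0 / (1.0 + math.exp(2.0 * k))


def decay_constant(omega, k):
    """Decay constant lambda = omega / (1 + exp(2K))."""
    return omega * transmission(k)


def surviving_fraction_by_impacts(omega, k, t):
    """Fraction left after omega * t independent impacts on the barrier.

    (1 - T)**n is evaluated as exp(n * log1p(-T)) so that the extremely
    small escape chance T is not lost to rounding.
    """
    impacts = omega * t
    return math.exp(impacts * math.log1p(-transmission(k)))


OMEGA = 1.0e21  # assumed classical vibration frequency in 1/sec (illustrative)
K = 30.0        # assumed barrier integral (illustrative)

lam = decay_constant(OMEGA, K)
mean_life = 1.0 / lam
for t in (0.5 * mean_life, mean_life, 2.0 * mean_life):
    by_law = math.exp(-lam * t)
    by_impacts = surviving_fraction_by_impacts(OMEGA, K, t)
    print(f"t = {t:.3e} sec: exp(-lambda t) = {by_law:.6f}, impacts = {by_impacts:.6f}")
```

The two columns agree because the escape chance per impact is minute, so the discrete sequence of impacts is indistinguishable from a continuous exponential decay.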

This expression for the decay constant $\lambda$ is perhaps as good as any.
The formula makes $\lambda$ depend primarily on $K(E_k)$. Let us assume that
the potential functions $V(r)$ vary rather slowly with the atomic number
of the atom under consideration. Then the change of $K$ from one set of
alpha particles to another will be due primarily to the change of $K$ with
energy rather than with $Z$. In the case of a parabolic hill (which the
actual barrier probably approximates very poorly) $dK/dE$ is constant,
being equal to a multiple of the period of vibration of a classical particle
moving under the influence of a potential $-V(r)$ and therefore subject
to an elastic restoring force. For the values of $K$ actually to be considered
$e^{2K} \gg 1$. Hence, in roughest first approximation,

\[ \frac{d\log\lambda}{dE} = \text{constant} > 0. \tag{31·22} \]

As the energy E of the initial weakly quantized state is the same as the 
energy of the escaping alpha particles, the above equation is seen to be an 
approximate form of the empirical Geiger-Nuttall relation. The theo- 
retical inaccuracy of this relation is paralleled by its experimental 
inaccuracy. 

Another, and theoretically somewhat more satisfactory, method of 
attack sticks to the wave point of view throughout, although making 
some use of physical and mathematical intuition. We know from Sec. 

¹ M. von Laue, Zeits. f. Physik 52, 726 (1928).



21j that in the case of a potential barrier in one dimension flanked by
regions of positive kinetic energy there are solutions of the equation
$H\psi = E\psi$ for every energy, which can be interpreted as descriptions of
steady primary streams of particles incident on the barrier from the left
or right, as the case may be, together with corresponding reflected and
transmitted streams. The notion of mass or probability current
density is equally useful in connection with our nuclear model. Let
$\psi = G(r,t)/r$ denote a normalized solution of the second Schrödinger
equation for the problem representing a state of zero angular momentum.
Then


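\[ \frac{\partial}{\partial t}\int_0^{r'}r^2\,dr\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\phi\;\psi\psi^* = -\frac{h}{i\mu}\Big(G^*\frac{\partial G}{\partial r} - G\frac{\partial G^*}{\partial r}\Big)_{r=r'}. \tag{31·23} \]

Let $C[G]$ denote the function $\frac{h}{i\mu}\Big(G^*\frac{\partial G}{\partial r} - G\frac{\partial G^*}{\partial r}\Big)$. It follows that in this
case we can identify $C[G]_{r'}$ with the instantaneous total current of
probability crossing a sphere of radius $r'$ in the outward direction.
Similarly, in the case of a solution of the radial equation $H_0\mathcal{R} = E\mathcal{R}$ we
can identify $C[\mathcal{R}]$ with the relative radial current of an hypothetical
infinite stream of particles (cf. Sec. 8, p. 31).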

Let $f_u(r,E)$ and $f_v(r,E)$ denote the B. W. K. approximations

\[ f_u = p^{-1/2}\exp\Big[\frac{2\pi i}{h}\int^{r}p(\xi,E)\,d\xi\Big], \qquad f_v = p^{-1/2}\exp\Big[-\frac{2\pi i}{h}\int^{r}p(\xi,E)\,d\xi\Big], \]

using the same conventions for the evaluation of $p$ and $p^{1/2}$ from the
equation $p^2 = 2\mu(E - V)$ as in Sec. 21j. Then we see from Sec. 21j that
a solution of the radial equation which has the form $\mathcal{R}(r,E) = f_u(r,E)$
near the origin can be interpreted as a stream of particles moving away
from the barrier, i.e., inward, in that region and having a relative current
value $4\pi/\mu$. Similarly an $f_v$ solution represents an equal stream moving
outward from the origin toward the barrier. In any region where $p$ is
real, the current, for a linear combination $a_uf_u + a_vf_v$ with constant
coefficients, is equal to the sum of the currents for the two terms taken
separately. Hence, we can regard such a function as a description
of a superposition of two streams moving in opposite directions. Outside
the volcano in III the same interpretation is possible except that $f_u$
gives an outward radial current of magnitude $4\pi/\mu$ and $f_v$ an inward
radial current of magnitude $4\pi/\mu$. In the case of a continuous-spectrum
eigenfunction the requirement that $\mathcal{R}(r,E)$ shall vanish at the
origin provides for total reflection at the origin and makes the net current
zero. If $\mathcal{R}(r,E)$ has a larger amplitude in I than in III, the diverging
and converging partial currents inside the crater will be correspondingly
large compared with the partial currents outside. Whether this condition
will obtain, or its reverse, is evidently a matter of phases. If we set
$E$ equal to the eigenvalue $E_k$ of the $V_1$ problem, the phases are such as
approximately to minimize the external currents relative to the internal
currents.

Consider next the wave function $G(r,t)$ generated by a weakly quantized
initial state $G(r,0)$. In Sec. 31d we found it convenient to identify
$G(r,0)$ with the $k$th discrete eigenfunction of the modified $V_1$ problem
whose equation is $H_0'u = Eu$. For our present purpose it is better to
give $G(r,0)$ some such form as

\[ G(r,0) = e^{-ar^n}\,\mathcal{R}(r,E_k), \]

choosing $n$ as a positive integer equal to or greater than 2, and giving $a$ a
value such that the modulating factor differs but little from unity at
the turning point $r = r_1$ yet approaches zero rather rapidly outside the
crater. $\mathcal{R}(r,E_k)$ is here a real class B eigenfunction of the radial equation.
If the barrier is a high one, we can fix $n$ and $a$ so that $G(r,0)$ is sensibly
equal to $\mathcal{R}(r,E_k)$ inside the crater, whereas the chance that the particle is
initially outside the crater is very small. The definition of $G(r,0)$ will
then differ so little from that previously used that the analysis of Sec. 31d
will be unaffected.

Physical — or is it mathematical? — intuition now tells us what to 
expect for the behavior of $G(r,t)$ in time with such an initial state; and
some readers, at any rate, will take a greater satisfaction in finding an 
intuitional explanation of the observed phenomenon than in finding a 
formal mathematical proof without an intuitional background. 

At the initial instant we have inside the crater a superposition of two
sets of approximately monochromatic waves of equal amplitude progressing
in opposite directions.¹ There is total reflection at the origin and very
nearly total, though gradual, reflection by the crater wall. The small
outward transmission through this wall is compensated by a reverse
transmission from similar progressive waves on the outside. The latter,
however, are damped radially and are essentially finite wave trains.
We may accordingly expect the front of the outward-moving train (the
$f_u$ train) to progress away from the origin with a speed equal to the group
velocity while the tail of the inward-moving train contracts toward the
origin with the same speed. Thus the incoming transmission through
the barrier will quickly disappear and we shall have a continual unbalanced
outward flow of probability through the crater wall. If the ratio
of the net outward current computed for, say, the outer turning point
to the integrated wave intensity inside the crater were constant,
the probability that the alpha particle is inside the crater would decay
exponentially in time. In that case we should have

\[ \frac{C[G]_{r_2}}{4\pi\int_0^{r_1}|G|^2\,dr} = \lambda = \text{constant}. \tag{31·24} \]

¹ The progress is to be measured by the group velocity rather than by the phase
velocity.

Actually this ratio will not be exactly constant, but it will be very nearly
so, for by Sec. 31d, and for points inside the crater, we can replace $\mathcal{R}(r,E)$
to a close approximation by $\mathcal{R}(r,E_k)$ in (31·18). $G(r,t)$ immediately
breaks into the product of $\mathcal{R}(r,E_k)$ and a function of the time, say $\varphi(t)$.
Thus the wave form of the waves trapped inside the crater must be
nearly constant in time, and the transmission coefficient for the outgoing
waves incident on the barrier must be at all times very nearly equal to
the transmission coefficient for a train of waves of uniform amplitude
and energy $E_k$ incident on that barrier. The emergent current is then
equal to the current reflected from the origin multiplied by $1/(1 + e^{2K})$.
To get the latter current we must resolve $G(r,t)$ in the neighborhood of
the origin into the linear combination

\[ G(r,t) = a_uf_u + a_vf_v. \tag{31·25} \]

The outward current at the origin is then $C[a_uf_u]$, of magnitude $4\pi|a_u|^2/\mu$.
Thus

\[ \lambda = \frac{|a_u|^2}{\mu\,(1 + e^{2K})\int_0^{r_1}|G|^2\,dr}. \]

To complete the computation $\int_0^{r_1}|G|^2\,dr$ must be evaluated in terms of
$|a_u|^2$. For this purpose it is convenient to reduce (31·25) to the form

\[ G = 2|a_u|\,p^{-1/2}\sin\Big(\frac{2\pi}{h}\int_0^{r}p\,d\xi\Big). \tag{31·26} \]

This equation is the initial condition for the exact function $G(r,t)$ which
is an approximate multiple of $\mathcal{R}(r,E_k)$, and so an approximate solution of
$H_0\mathcal{R} = E_k\mathcal{R}$. An exact evaluation of $G$ and the desired integral is
unnecessary since the law of force is known only in a qualitative manner,
and the model itself is only a crude one. For the upper nuclear-energy
levels $E_k$ the approximation of Sec. 21i is appropriate and yields

\[ \int_0^{r_1}|G|^2\,dr = 2|a_u|^2\int_0^{r_1}\frac{dr}{p} = \frac{|a_u|^2}{\mu\,\omega(E_k)}, \tag{31·27} \]


where $\omega(E_k)$ is the classical radial-vibration frequency inside the crater.
Thus we obtain the value $\omega(E_k)/(1 + e^{2K})$ for $\lambda$ in exact agreement with
the von Laue result (31·21).





In the case of the normal state of the nucleus, in which we are chiefly
interested, the foregoing approximation is unsatisfactory although correct
as regards order of magnitude. For this lowest energy level $E_0$ the function
$G$ will have no nodes between the origin and $r_2$. Since it starts off at the
origin as a sine curve with amplitude $|a_u||p(0,E_0)^{-1/2}|$, it is reasonable to
evaluate the integral, treating $G$ as a single arch of a sine curve with the
above amplitude and a phase $3\pi/4$ at $r = r_1$. On this hypothesis

$\int|G|^2\,dr = 2r_1|a_u|^2/3p(0,E_0)$ and $\lambda$ has the value

It is of interest to note that the von Laue value of the decay constant
when combined with the estimated energy uncertainty of a weakly
quantized state gives an interesting illustration of the Heisenberg uncertainty
principle. In a time $1/\lambda$ the probability that an initially undisintegrated
nucleus will remain undisintegrated is reduced to $1/e$. Hence
we can say that the uncertainty in the lifetime of an undisintegrated
state is of the order of magnitude of $1/\lambda$. As the product of $1/\lambda$ and
$\Delta E_k$, using (31·21) and (31·16), is $h/4\pi$ with an error of the order of the
small quantity $e^{-2K}$, we see that we have to do with an almost ideal case in
which the uncertainty in energy and time has a minimum product.
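
The arithmetic behind this statement is short enough to verify directly. The sketch below takes the half breadth as $\Delta E_k = h\,\omega\,e^{-2K}/4\pi$ and the decay constant as $\lambda = \omega/(1 + e^{2K})$, and checks that the product $\Delta E_k\cdot(1/\lambda)$ exceeds $h/4\pi$ by the fraction $e^{-2K}$. The numerical values of $\omega$ and $K$ are illustrative assumptions; deliberately modest values of $K$ are used so that the small correction remains visible in floating-point arithmetic.

```python
import math

H_PLANCK = 6.626e-27  # Planck's constant in erg sec (rounded modern value)
OMEGA = 1.0e21        # assumed classical frequency in 1/sec (illustrative)


def half_breadth(omega, k):
    """Half breadth of the resonance peak, h * omega * exp(-2K) / (4 pi)."""
    return H_PLANCK * omega * math.exp(-2.0 * k) / (4.0 * math.pi)


def decay_constant(omega, k):
    """Decay constant omega / (1 + exp(2K))."""
    return omega / (1.0 + math.exp(2.0 * k))


def product_ratio(omega, k):
    """(Delta E / lambda) expressed in units of h / (4 pi)."""
    product = half_breadth(omega, k) / decay_constant(omega, k)
    return product / (H_PLANCK / (4.0 * math.pi))


for k in (1.0, 2.0, 5.0):
    excess = product_ratio(OMEGA, k) - 1.0
    print(f"K = {k}: excess over h/4pi = {excess:.6e}, exp(-2K) = {math.exp(-2.0 * k):.6e}")
```

The frequency cancels from the product, so the result depends on the barrier integral alone.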

*31f. Complex Eigenvalues. — On account of its mathematical interest,
mention should be made here of the elegant method of complex
eigenvalues applied to the decay problem by Gamow.¹ The method
starts from the observation that the experimental law of radioactive
decay would be exactly described by a wave function of the form

\[ \psi = \frac{\mathcal{R}(r)}{r}\,e^{-\frac{\lambda t}{2} - \frac{2\pi iEt}{h}}. \tag{31·28} \]

A function of this type will satisfy the second Schrödinger equation, if
$\mathcal{R}(r)$ is a solution of the radial equation

\[ H_0\mathcal{R} = (E - i\alpha)\mathcal{R}; \qquad \alpha = \frac{h\lambda}{4\pi}. \tag{31·29} \]


This latter equation with the suggested complex value of the energy 
parameter is of course a perfectly good differential equation and has the 
same sort of manifold of solutions as the corresponding equation for a 
real parameter. The solutions can be studied by the B. W. K. method 
as before. 

¹ Cf. footnote 2, p. 179.

In this case the function $p(x, E - i\alpha) = \sqrt{2\mu(E - i\alpha - V)}$ is
complex on both sides of the potential barrier as well as in the region II
under the "mountain." However, if we take into account the experimental
values of the decay constant $\lambda$ and the energy of the escaping
alpha particles, we must assume that in all practical cases the imaginary
part of $p$ is extremely small except close to the zeros of its real part. Consider
the case of thorium emanation, the atoms of which have an average
life of the order of 1 min. $\lambda$ has the value $1.27 \times 10^{-2}$ sec.$^{-1}$ while the
energy is $10^{-5}$ ergs. Hence $\alpha/E = \lambda h/4\pi E = 6.6 \times 10^{-25}$. We can
accordingly neglect $\alpha^2$ in comparison with $(E - V)^2$, thus obtaining the
approximation

\[ p = p_0 - \frac{i\mu\alpha}{p_0}; \qquad p_0 = [2\mu(E - V)]^{1/2}. \tag{31·30} \]
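
The smallness of the imaginary part of the energy is easily checked. The sketch below reads the figures for thorium emanation as $\lambda = 1.27 \times 10^{-2}$ sec$^{-1}$ and $E = 10^{-5}$ erg and evaluates $\alpha/E = \lambda h/4\pi E$. The two values of Planck's constant are assumptions introduced for the comparison: one is roughly the value in use when the text was written, the other a rounded modern value.

```python
import math

H_OLD = 6.55e-27   # erg sec, roughly the value in use around 1930 (assumption)
H_NEW = 6.626e-27  # erg sec, rounded modern value

LAMBDA = 1.27e-2   # decay constant of thorium emanation, 1/sec
ENERGY = 1.0e-5    # alpha-particle energy, erg


def alpha_over_energy(h, lam, energy):
    """alpha / E with alpha = lambda * h / (4 pi)."""
    return lam * h / (4.0 * math.pi * energy)


ratio_old = alpha_over_energy(H_OLD, LAMBDA, ENERGY)
ratio_new = alpha_over_energy(H_NEW, LAMBDA, ENERGY)
mean_life = 1.0 / LAMBDA

print(f"alpha/E with h = 6.55e-27:  {ratio_old:.3e}")
print(f"alpha/E with h = 6.626e-27: {ratio_new:.3e}")
print(f"mean life: {mean_life:.1f} sec")
```

Either value of $h$ gives a ratio of a few parts in $10^{25}$, which is why the expansion of $p$ to first order in $\alpha$ is entirely adequate.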

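We identify $r_1$ and $r_2$ in this case with the zeros of $p_0$. In the region III
the B. W. K. approximations $f_u$ and $f_v$ become

\[ f_u = p_0^{-1/2}\exp\Big[\frac{2\pi i}{h}\int_{r_2}^{r}p_0\,d\xi + \frac{2\pi\mu\alpha}{h}\int_{r_2}^{r}\frac{d\xi}{p_0}\Big], \qquad f_v = p_0^{-1/2}\exp\Big[-\frac{2\pi i}{h}\int_{r_2}^{r}p_0\,d\xi - \frac{2\pi\mu\alpha}{h}\int_{r_2}^{r}\frac{d\xi}{p_0}\Big]. \tag{31·31} \]

As $p_0$ is real and positive in III by our conventions, $|f_u|$ increases without
limit as $r$ increases, while $|f_v|$ decreases and approaches zero at $r = \infty$.
The current theorem of Eq. (31·23) is still valid and the currents of the
$f_u$ and $f_v$ approximations are sensibly the same as before, except for an
additional multiplying factor due to the contribution of $\alpha$ to the variation
in the amplitudes with $r$. The important point for us to note is that $f_u$
still represents an outward flow of current and $f_v$ an inward one.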

Any solution of (31·29) yields a corresponding function $\psi$ given by
Eq. (31·28) which can be interpreted as the superposition of two decaying
streams of alpha particles, one radiating from the origin and the other
converging toward it. These currents can be balanced at the origin if the
absolute value of the $f_u$ component of $\mathcal{R}$ is equal to the absolute value
of the $f_v$ component at the same point. This condition can be satisfied for
any value of the complex number $E - i\alpha$. In general, however, a
solution $\mathcal{R}(r, E - i\alpha)$ will yield converging as well as diverging currents
in III and thus will differ radically from the form needed for the representation
of an assemblage of disintegrating nuclei. There exist solutions of
(31·29), however, which give a net current zero at the origin and also
take the asymptotic form $a_uf_u(r, E - i\alpha)$ in III. A necessary condition
that $\mathcal{R}$ shall be of this type is that the current reflected by the crater shall
differ from the diverging current incident on the crater by an amount
equal to the transmitted current. But by using a small positive value
of $\alpha$ we provide for a slight increase in the diverging current between the
origin and the crater and also a slight decrease in the converging current
between the same pair of points. For just one value of $\alpha$ the current difference
at the crater will just balance the leakage due to the tunnel effect.
This balancing of currents is insufficient, however, to insure that there
shall be no converging stream outside the hill. For that purpose it is
necessary that $\mathcal{R}$ shall have a definite phase at the tunnel entrance $r_1$.
When this phase is properly chosen the phase at the origin will not in
general make the origin a nodal point for $\mathcal{R}$. In order to keep $\psi$ from
becoming infinite at the origin we must then choose the real energy $E$
in a special way so that the phase requirement at the crater will be
compatible with the requirement that $\mathcal{R}$ vanish at the origin. Owing
to the very small transmission coefficient, $\alpha$ is very small, and hence the
energies $E_k'$ which conform to the above phase requirement are very
nearly the same as the energies $E_k$ of our preceding discussion. The
successive values of $E_k' - i\alpha_k'$ for the different resonance points on the
energy scale are called complex eigenvalues of the energy, although
they have not at all the same significance as real discrete eigenvalues.

In the neighborhood of the origin the eigenfunctions $\mathcal{R}(r, E_k' - i\alpha_k')$
conform to the requirements for the initial weakly quantized state of an
assemblage of predisintegrated nuclei. Moreover, the corresponding
complete wave function decays in time according to the exponential
law found by experiment. Hence Gamow assumes the legitimacy of
identifying the decay constant of an observed nuclear disintegration
process with the decay constant $\lambda_k$ computed by means of (31·29) from
a corresponding complex eigenvalue $E_k' - i\alpha_k'$.

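Let us grant for the moment the legitimacy of the Gamow hypothesis.
We can then identify $E_k'$ with $E_k$ with sufficient precision to permit an
accurate estimate of $\alpha_k'$. Equation (31·24) is valid if we identify $G$
with $\mathcal{R}(r, E_k' - i\alpha_k')\,e^{-\frac{2\pi i}{h}(E_k' - i\alpha_k')t}$. Hence

\[ \lambda = \frac{h}{4\pi\mu i}\,\frac{\Big(\mathcal{R}^*\dfrac{d\mathcal{R}}{dr} - \mathcal{R}\dfrac{d\mathcal{R}^*}{dr}\Big)_{r=r_2}}{\int_0^{r_1}|\mathcal{R}|^2\,dr}. \tag{31·32} \]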


In evaluating $\lambda$ by this formula we ought, strictly speaking, to use the
eigenfunction $\mathcal{R}(r, E_k' - i\alpha_k')$, but on account of the extreme smallness
of $\alpha_k'$ it is evident that there will be no appreciable error if we substitute
for this eigenfunction the function $\mathcal{R}(r,E_k)$ used in Sec. 31e. Thus
we come back in first approximation to the result previously derived
from an altogether different point of view. Higher approximations
could be obtained but are not worth while.

In view of the agreement with the result obtained in Sec. 31e and that
derived by the perturbation-theory method the legitimacy of the Gamow
hypothesis can hardly be questioned. Unfortunately, however, a
completely satisfactory justification of the method independent of any
alternative method of approach is lacking. To be sure we can create a
quadratically integrable wave function representing a weakly quantized
initial state by multiplying $\mathcal{R}(r, E_k' - i\alpha_k')$ by a modulating factor, say
$e^{-ar^n}$, and can then justify (31·32) by the same type of heuristic argument
as we have used in Sec. 31e. The problem of placing these heuristic
arguments on a rigorous basis has not been solved as yet, however.

In concluding this section we call the reader's attention to two
other important examples of weak quantization.¹ The so-called predissociated
molecular states are imperfectly quantized excited states of
molecules having energies greater than the minimum dissociation energy
of the molecule and having a finite lifetime due to spontaneous transitions
to the dissociated states. The energy levels which give rise to X-ray line
spectra are the energies of weakly quantized ionic states capable of a
second spontaneous ionization in which the energy set free by an outer
electron dropping down into a vacant space in an "inner shell" is used
to eject a second outer electron. This type of process, called the
Auger effect, will be the subject of further discussion in Sec. Z2h.

Radiative transitions can take place between different imperfectly 
quantized states and between imperfectly quantized states and sharply 
quantized states. The diffuseness of the initial or final energy level, 
as the case may be, then produces a broadening of the spectrum line 
emitted or absorbed and referable ultimately to the finite lifetime of one 
or both of the associated energy levels. A measurable broadening of 
this type is found in many band-spectrum lines having predissociated 
initial or final states and in many X-ray lines, especially those of longer 
wave length. 

In a sense all atomic and molecular energy levels except normal states 
and metastable states — and the latter form a doubtful exception — are
imperfectly quantized, for the existence of spontaneous radiative transi- 
tions from an upper energy level to any lower level implies a finite lifetime 
for the former. The broadening of spectrum lines produced in this way 
by the emission of radiation itself is swamped in most cases, however, by 
broadening due to other causes such as collisions, Doppler effect, etc. 

32. THE EXISTENCE AND PROPERTIES OF SOLUTIONS OF THE MANY- 
PARTICLE SCHRÖDINGER EIGENVALUE-EIGENFUNCTION PROBLEM

32a. Introduction. — In the work of Secs. 28 to 30 we have used the 
method of the separation of variables to demonstrate the existence 
and properties of solutions of the Schrödinger eigenvalue-eigenfunction
problem for two oppositely charged interacting particles. The separation
of variables in effect resolves the three-dimensional two-particle problem

¹ Cf. O. K. Rice, Phys. Rev. 34, 1461 (1929), 36, 1638 (1930) for a theoretical
discussion of predissociated molecules with references to the literature. For theory of the
Auger effect cf., e.g., G. Wentzel, Zeits. f. Physik 43, 524 (1927), Physik. Zeits. 29,
333 (1928); E. Fues, Zeits. f. Physik 48, 726 (1927).

into three Sturm-Liouville problems and permits us to use the highly
developed Sturm-Liouville theory in this more general case. The wave 
equation for two particles with like charges can be separated in the same 
way, although the radial equation has no discrete eigenvalues. The 
method is inapplicable, on the other hand, to the more complicated 
problems of atomic and molecular structure involving three or more 
interacting particles, and, in fact, the existence of solutions of these 
problems has yet to be proved. 

The basic differential equation for the problem of $n = f + 1$ particles
is (17·1). Let $\mathbf{r}_k$ denote the radius vector from particle $f + 1$ to particle $k$,
and let $x_k$, $y_k$, $z_k$ denote its components. Let $r_{jk}$ denote the absolute
value of the distance from particle $j$ to particle $k$. Assuming that all
particles are subject to interactions of the Coulomb or electrostatic
type,¹ we throw Eq. (17·1) into the form

(32·1)


If this equation is correct, it should have quadratically integrable 
eigenfunctions corresponding to the stable states of atoms, molecules, and 
ions revealed by experiment. Moreover, an essentially arbitrary func- 
tion of the coordinates representing the instantaneous form of a wave 
packet should be capable of expansion into a discrete-continuous linear 
combination of type A and type B eigenfunctions. In Secs. 19 and 23 
we have developed existence theorems for quadratically integrable 
eigenfunctions in one dimension, but no corresponding existence theorem 
for the many-dimensional case has been constructed. In fact when we 
pass from the two-particle problem to the many-particle problem of
(32-1) we pass from a domain in which there is a well-developed basic 
mathematical theory to a domain of mathematical ignorance. It 
becomes necessary to assume both the existence and basic properties of 
the discrete- and continuous-spectrum eigenfunctions. These assump- 
tions can be made out of hand, or they can be partially justified by 
plausibility considerations of a mathematical character. We adopt the 

¹ An assumption which must be discarded when a satisfactory relativistic
formulation of the quantum theory is finally constructed.



Sec. 32] PROPERTIES OF SOLUTIONS OF MANY-PARTICLE PROBLEM 197

second procedure, which at least exhibits the difficulties in the nature of a 
satisfactory general theory if it does not overcome them. 

It will be of comfort to the reader of the remainder of this section to
note that we have some evidence of the existence of discrete eigenfunctions
for the simpler many-dimensional problems through the success of
attempts to work them out and locate the eigenvalues by successive
approximations. We have no proof of the ultimate convergence of these
approximations, but in the best work of this kind on the helium atom¹
and the hydrogen molecule,² for example, the computed approximate
energies appear to converge in a quite satisfactory manner upon values
very close to the experimental ones.

32b. New Boundary Conditions for Physically Admissible Wave 
Functions. — Before attempting a formal extension of the general theory 
of one-dimensional eigenvalue problems to many dimensions we pause to
reconsider the boundary-continuity conditions for physically admissible
wave functions. In laying down our preliminary definition of physically
admissible wave functions (type A functions) in Sec. 17 we were primarily
concerned with the selection of the discrete eigenfunctions of $H\psi = E\psi$
from the totality of the solutions of this equation. We should like,
however, to define a class of “physically admissible” functions which
shall include the above-mentioned eigenfunctions and which are to be
admitted as descriptions of the most general subjective states, or of
corresponding assemblages of identical systems prepared so as to be in a
common state. We shall have much to do with the business of expanding
such functions in terms of the eigenfunctions of various operators, and
it will reduce our worries if we can assume that all allowed $\psi$ functions
satisfy rather stringent boundary-continuity conditions.

As the solutions of the hydrogenic-atom problem worked out in
Sec. 29 are particularly well behaved both at the origin and at infinity,
the question arises whether we cannot require that all physically
admissible wave functions shall share these desirable features. Such a
restriction is evidently permissible provided that the discrete eigenfunctions
of the many-particle problem have the same characteristics as those of the
two-particle problem, and provided that the proposed restrictions on
physically admissible wave functions leave a class broad enough to describe
any experimentally realizable physical state.

Clearly the best way to answer our question is to begin by defining
the proposed restrictions, reserving for later discussion the question of
their validity. We accordingly lay down a series of five conditions
$D_1$, $D_2$, $D_3$, $D_4$, $D_5$ and designate a function which conforms to all
of them as a function of class or type D. The phrase “physically admissible”
will be interpreted for the present as synonymous with “type D.” In

¹ E. A. Hylleraas, Zeits. f. Physik 54, 347 (1929).

² H. M. James and A. S. Coolidge, J. Chem. Phys. 1, 825 (1933).





Sec. 42b we shall add a further restriction to the manifold of physically
admissible wave functions not included in the definition of D. The
conditions are defined with reference to a particular Hamiltonian operator,
so that each Hamiltonian generates its own class D, and are directly
applicable in general only when the functions are expressed in terms of
Cartesian coordinates. They are also limited to Hamiltonians of the
usual nonrelativistic type with no other singularities than those produced
by simple Coulomb-type poles in the potential function [cf. Eq. (32-1)].¹
The explicit formulation of the D conditions is as follows.

$D_1$. Every $\psi(x_1, \dots, z_f)$ in class D is single-valued and analytic
in all the variables at every point where the potential energy
$V(x_1, \dots, z_f)$ is analytic. In other words $\psi$ is analytic at every finite
point of configuration space that does not bring two charged particles
together.

$D_2$. $\psi$ shall vanish at infinity faster than any negative power of the
coordinates. This requirement, which is very simple in the one-dimensional
case, can be formulated more precisely as follows: Let $Q$ denote a
point in the coordinate space and let $P$ be any polynomial of the
coordinates $x_1, \dots, z_f$. Then, if $Q$ moves out to infinity along any path
involving no singular points,

$$\lim_{Q \to \infty} P(Q)\,\psi(Q) = 0.$$

This condition is assumed to apply as well to the first and second
derivatives of $\psi$ with respect to the coordinates of the configuration space.

$D_3$. $\psi$ and its first derivatives shall be absolutely and quadratically
integrable over the whole configuration space.²

$D_4$. In order to insure that the probability current to the singular
domain $r_{ij} = 0$ shall vanish, it is necessary to introduce a condition best
¹ This limitation in applicability is necessary because the discrete eigenfunctions
of an equation of the form $H\psi = E\psi$ always have “built-in” singularities fitted
to the singularities in $H$. These are of an entirely different character in the
relativistic and nonrelativistic theories of the hydrogen atom.

In the case of a negative inverse-square potential the radial equation of the
two-particle problem has no proper set of eigenfunctions at all. In this case none
of the solutions of the radial equation satisfies the singular-point boundary
conditions. The general problem of the inverse-square potential has been discussed
by Shortley, Phys. Rev. 38, 120 (1931) but without reference to the singular-point
boundary conditions. The difficulties to which it leads are of no real physical
importance since the inverse-square potential does not exist in nature. In fact we
must regard even the Hamiltonian of (32-1) with its Coulomb potential terms as a
useful and convenient approximation to the true Hamiltonian operator rather than
anything of absolute significance.

² This evidently implies that the product of $\psi(x_1, \dots, z_f)$, or of any of
its first derivatives, with any polynomial $P_n(x_1, \dots, z_f)$ is also absolutely
and quadratically integrable, since these polynomials are analytic and bounded in
every finite domain. The rapid decrease of $\psi$ and its derivatives takes care of
the convergence at infinity.





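described in terms of a new set of coordinates. In place of the coordinates
$x_i$, $y_i$, $z_i$, $x_j$, $y_j$, $z_j$ we introduce the coordinates $\xi$, $\eta$, $\zeta$ of the
center of gravity of particles $i$ and $j$ together with the spherical relative
coordinates $r_{ij}$, $\theta_{ij}$, $\varphi_{ij}$. Let $\Phi(r_{ij}, \theta_{ij}, \varphi_{ij})$ denote the integral

$$\Phi = \int \cdots \int |\psi|^2 \, d\xi\, d\eta\, d\zeta\, d\tau',$$

in which $d\tau'$ is the product of the differentials of all the Cartesian
coordinates except $x_i$, $y_i$, $z_i$, $x_j$, $y_j$, $z_j$. The quadratic integrability of
$\psi$ insures the convergence of the integral. Let $A$ denote the sphere $r_{ij} = a$
in the three-dimensional space of $\mathbf{r}_{ij}$, and let $W_a$ denote the mean value
of $\Phi$ for the sphere $A$. Thus

$$W_a = \frac{1}{4\pi} \int_0^{2\pi}\!\!\int_0^{\pi} \Phi(a, \theta_{ij}, \varphi_{ij}) \sin\theta_{ij}\, d\theta_{ij}\, d\varphi_{ij}.$$

Similarly we designate by $W_a'$ the corresponding mean with
$\partial\psi/\partial r_{ij}$ substituted for $\psi$ throughout.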

We now require that if $\psi$ is in class D there shall exist a positive real


n / 

number € and a real number m such that ^ 

I wr ij 

are bounded in the neighborhood of a = 0. 

$D_5$. If $\psi$ is in D, the conditions $D_1$ to $D_4$ shall apply to $H^n\psi$ as well
as to $\psi$ itself. Here $H^n$ denotes the $n$-fold application of the Hamiltonian
operator $H$. The series

$$\sum_{n=0}^{\infty} \frac{1}{n!}\left(-\frac{2\pi i t}{h}\right)^{n} H^n\psi$$

is assumed to converge for all values of $t$ in the neighborhood of every
point $x_1, \dots, z_f$ and must yield a function which is also of class D.

The reader will readily verify that these conditions are satisfied
by the discrete eigenfunctions of the hydrogenic-atom problem and by
their linear combinations. Hence there is an infinite multiplicity of
class D solutions of the Schrödinger equation
$H\psi = -\dfrac{h}{2\pi i}\dfrac{\partial\psi}{\partial t}$ for this
problem. The existence of class D functions for the Hamiltonian of a
many-particle problem has not been proved but is a plausible assumption
to be considered below.

We proceed to a brief discussion of the successive items which con- 
stitute the definition of class D. 

$D_1$ is an extension of the continuity condition for type A and type B
functions. It seems probable that all bounded solutions of the Schrödinger
equation $H\psi = -\dfrac{h}{2\pi i}\dfrac{\partial\psi}{\partial t}$ are analytic
in $t$ and in the space coordinates except on singular domains of the
operator $H$. At any rate this is true of all the familiar solutions of this
equation. We wish every “physically admissible” function to be a potential
initial state for a physically admissible solution of the above equation and
accordingly introduce the condition $D_1$. The condition is convenient, and,
in the writer's opinion, unobjectionable, whether necessary or not.

As regards the adequacy of analytic functions for the description
of experimental situations, it may be observed that, although the
idealization of physical experiments may lead to situations requiring
nonanalytic functions for their description, this is evidently not true for the
actual experiments themselves. For example, the exact location of an electron
in space could be described in the language of wave mechanics only by a
function, the Dirac $\delta$ function, so discontinuous that it does not exist!
But when we consider the inevitable experimental error we see that the
result of an actual positional observation is to give us knowledge about
the electron representable by an analytic function which conforms
roughly to a condition of the form

$$|\psi|^2 \cong e^{-a[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2]}.$$

In general the interaction of an atomic system with a piece of apparatus
conceived of as classical may yield exact information representable
only by nonanalytic waves. However, when our imperfect knowledge
of the state of the actual instruments is taken into account, we see that
the need of nonanalytic functions is apparent rather than real.

The condition $D_2$ goes beyond the mere requirement of quadratic
integrability for type A functions. Together with $D_3$ it imposes the
severest restrictions on the behavior of type D functions at infinity,
roughly equivalent to saying that these functions must approach zero
exponentially for large values of the coordinates. The requirement of
absolute integrability is one which we have already seen to be very
convenient in connection with the determination of expansion coefficients
where a continuous spectrum is involved (cf. pp. 36 and 171).

$D_4$ is analogous to the s.p.b.c. (singular-point boundary condition) of
Sec. 23d and is formulated for the same purpose. Its use permits us to
identify class D with the class of permissible comparison functions for a
variational formulation of the problem of locating the discrete eigenvalues
of $H$ with their eigenfunctions.

$D_5$ is designed to insure that the second Schrödinger equation

$$H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$$

shall transform physically admissible wave functions into
new physically admissible wave functions through the passage of time.
As previously stated, it seems probable that all bounded solutions of
this equation are analytic in the time like the product-form functions
$\psi = \psi(x,E)\,e^{-2\pi i E t/h}$ and their linear combinations. We assume
that such is the case in order to validate our proof (cf. footnote 2, p. 18)
that $\psi$ is uniquely determined for all time by the Schrödinger equation and
its form at some arbitrary initial instant. It follows from the equation
itself that

$$\frac{\partial^n\psi}{\partial t^n} = \left(-\frac{2\pi i}{h}\right)^{n} H^n\psi.$$

Thus the series specified in $D_5$ is the Taylor's series expansion of $\psi$ in
powers of $t$, which must converge for small values of $t$ if $\psi$ is analytic
in $t$ as well as in the space variables.

Let us define operators of exponential form, such as $e^{-2\pi i H t/h}$, by
means of the formal power-series expansion for $e^x$. If $\psi$ is analytic in $t$,

$$\psi(x,t) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\left[\frac{\partial^n\psi}{\partial t^n}\right]_{t=0}.$$

If $\psi$ is a solution of the second Schrödinger equation this becomes¹

$$\psi(x,t) = e^{-2\pi i H t/h}\,\psi(x,0).$$

Hence $e^{-2\pi i H t/h}$ is sometimes called the time-displacement operator. By
applying this operator to the initial function $\psi(x)$ we obtain a
corresponding solution of the second Schrödinger equation. $D_5$ requires that
this operator shall always transform class D functions into class D functions.
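
As a numerical illustration of the time-displacement operator, the following minimal sketch sums the power series for $e^{-2\pi i H t/h}$ term by term. It assumes units in which $h/2\pi = 1$ and replaces the differential operator $H$ by a hypothetical 2×2 Hermitian matrix, so `H`, `psi0`, and the helper names are illustrative assumptions and do not come from the text. The sketch compares the normalization integral before and after the displacement.

```python
# Truncated power series for exp(-i*H*t) acting on a two-component "wave function".
# Assumption: units with h/(2*pi) = 1; H is an illustrative Hermitian matrix,
# standing in for the Hamiltonian operator of the text.

def mat_vec(m, v):
    """Apply the square matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def time_displace(h_matrix, psi0, t, terms=40):
    """Return sum over n < terms of (1/n!) * (-i*t)^n * H^n psi0."""
    result = list(psi0)
    term = list(psi0)
    for n in range(1, terms):
        # Each new term is the previous one multiplied by (-i*t/n) * H.
        term = [(-1j * t / n) * c for c in mat_vec(h_matrix, term)]
        result = [r + c for r, c in zip(result, term)]
    return result

def scalar_product(a, b):
    """Discrete analogue of the integral of a times the conjugate of b."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

H = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]]   # Hermitian by construction
psi0 = [0.6, 0.8j]
psi_t = time_displace(H, psi0, t=0.7)

norm0 = scalar_product(psi0, psi0).real
norm_t = scalar_product(psi_t, psi_t).real
print(norm0, norm_t)
```

Because the matrix is Hermitian, the two printed norms agree to rounding error, which mirrors the constancy of the normalization integral established later in this section. Applying `time_displace` twice, for times 0.3 and 0.4, reproduces `psi_t` to the same accuracy.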

An important and readily verified general property of the class D
functions is that they form a linear manifold. That is, if any two
functions $\psi_1$, $\psi_2$ conform to the D conditions, an arbitrary linear
combination will also conform to these conditions. This is the principle of the
“addition of states” emphasized by Dirac.²

*32c. Approximating Arbitrary Quadratically Integrable Functions 
by Means of Class D Functions. — For the special case of three-dimen- 
sional functions where class D is defined with reference to the Hamiltonian 




(32-2) 


it is possible to prove another important general property of that class,
viz.: For every quadratically integrable function $f(x,y,z)$, whether it belongs
to class D or not, and for every positive constant $\epsilon$, however small, there
exists a class D function $\psi(x,y,z)$ such that
$\iiint |f - \psi|^2\,dx\,dy\,dz < \epsilon$. In
the language of von Neumann, the manifold D is “everywhere dense”

¹ Cf. von Neumann, M.G.Q., p. 108.

² Dirac, P.Q.M., section 7.




in the Hilbert space of all quadratically integrable functions of $x$, $y$,
and $z$.¹

To prove this proposition we make use of the eigenfunctions of the 
equation 


vV + i = 0. (32-3) 

This is simply a central-force-field problem with a potential function 
of the Coulomb type at the origin, but becoming infinite at infinity. 
Solving by the method of the separation of variables, we get a three- 
dimensional array of eigenfunctions 


which are of class D with respect to the Hamiltonians of Eqs. (32-2) and
(32-3). Each of the three one-dimensional equations into which the
original three-dimensional equation is resolved has a complete set of
discrete eigenfunctions. It follows² that the product functions form a
complete set in the sense that


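$$\lim_{\lambda,\mu,\nu \to \infty} \iiint \Bigl| f - \sum_{m=-\lambda}^{+\lambda}\ \sum_{l=|m|}^{|m|+\mu}\ \sum_{n=l+1}^{l+\nu} c_{nlm}\psi_{nlm} \Bigr|^2 r^2 \sin\theta\, dr\, d\theta\, d\varphi = 0,$$

$$c_{nlm} = \iiint f\,\psi_{nlm}^{*}\, r^2 \sin\theta\, dr\, d\theta\, d\varphi,$$

for every quadratically integrable function $f$. Then for every $f$
and every $\epsilon$ we can choose $\lambda$, $\mu$, $\nu$ large enough so that, if

$$\psi = \sum_{m=-\lambda}^{+\lambda}\ \sum_{l=|m|}^{|m|+\mu}\ \sum_{n=l+1}^{l+\nu} c_{nlm}\psi_{nlm},$$

the inequality $\iiint |f - \psi|^2\,dx\,dy\,dz < \epsilon$ will hold.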

In the general case of a many-particle problem with one of the basic
Hamiltonians of Eqs. (7-2), (7-3), and (32-1), it is not possible to force a
separation of variables without dropping the singular domains $r_{ij} = 0$.
Hence the above type of proof fails, but there can be little doubt of the
validity of the theorem in this general case and we assume it as a postulate.
This proposition is the basis of our claim that class D is broad
enough for the purposes of quantum mechanics.

32d. Hermitian Character of the Hamiltonian Operator. — It is 
desirable at this stage in the development of the theory to replace the 
definitions of the adjoint to a one-dimensional second-order differential 
operator, and of the self-adjoint property, by new definitions of broader 
scope applicable to linear operators which are not necessarily of 

¹ Cf. von Neumann, M.G.Q., p. 23.

² Cf. Courant-Hilbert, M.M.P., Kap. II, §1, 6.





differential character. These terms will be used hereafter only in the 
new sense. 

Definition:¹ Let $C$ denote a linear manifold of functions defined and
quadratically integrable over a domain $M$ of coordinate space. Let $O$ and
$O^\dagger$ denote two linear operators which yield quadratically integrable
transforms $O\psi$ and $O^\dagger\psi$ when applied to any function $\psi$ of the
manifold $C$. If

$$(\psi_1, O\psi_2) = (O^\dagger\psi_1, \psi_2)$$ (32-4)

for every pair of functions $\psi_1$, $\psi_2$ which belong to $C$ when the domain of
integration is extended over $M$, the operator $O^\dagger$ is said to be adjoint to
$O$ with respect to the manifold $C$ and the domain $M$.

It follows as a corollary that if $O^\dagger$ is adjoint to $O$, the latter operator
must be adjoint to $O^\dagger$.

Definition: If the operator $O$ is adjoint to itself (self-adjoint) with
respect to a manifold $C$ and a domain $M$, it is said to be Hermitian with
respect to $C$ and $M$. The relation

$$(\psi_1, O\psi_2) = (O\psi_1, \psi_2)$$ (32-5)

must then hold for every pair of functions $\psi_1$, $\psi_2$ in $C$ when the
integration is extended over $M$.

Corollary: If the linear operators $\alpha$ and $\beta$ are both Hermitian with
respect to a manifold $C$ and a domain $M$, $\alpha - i\beta$ is adjoint to
$\alpha + i\beta$ with respect to the manifold $C$ and the domain $M$.
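
The corollary can be checked in a few lines. The sketch below assumes that the scalar product is $(f,g) = \int f g^{*}\,d\tau$, linear in its first member; this convention is an assumption made here for definiteness, and the opposite convention leads to the same conclusion.

```latex
\begin{aligned}
(\psi_1,(\alpha+i\beta)\psi_2)
  &= (\psi_1,\alpha\psi_2) - i\,(\psi_1,\beta\psi_2) \\
  &= (\alpha\psi_1,\psi_2) - i\,(\beta\psi_1,\psi_2) \\
  &= ((\alpha-i\beta)\psi_1,\psi_2).
\end{aligned}
```

The first line uses the conjugation of the factor $i$ in the second member, the second line uses the Hermitian property of $\alpha$ and $\beta$ separately, and the result has the form of (32-4) with $O = \alpha + i\beta$ and $O^\dagger = \alpha - i\beta$.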

Corollary: A Sturm-Liouville operator

* ■ 

is Hermitian with reference to an interval $a \le x \le b$ in which $A$ has no
singular points and to any class of functions which are twice differentiable

¹ This and the following definition may be compared with the corresponding
definitions as laid down by von Neumann, M.G.Q., pp. 48, 50. Von Neumann makes no
mention of the manifold $C$ and domain $M$ in defining either the adjoint operator
$O^\dagger$ or the Hermitian property. This is because he includes in the definition
of each operator a linear manifold of functions (or of elements of abstract Hilbert
space) on which it may operate, and for which its transforms are required to be
quadratically integrable over a fundamental domain $M$ (or to be new elements of
abstract Hilbert space). We have defined an operator as merely a rule for
transforming one function into another. It follows that what we call a single
operator can give rise to a multiplicity of operators in the von Neumann sense by a
multiplicity of choices of the manifold of functions on which they are allowed to
operate.

It will be observed that in the case of a second-order differential operator in one
dimension, if $A^\dagger$ is adjoint to $A$ in the sense of the definition given in
footnote 1, p. 122, it is also adjoint to $A$ in the sense of the new definition with
respect to the domain $a < x < b$ and any linear manifold of functions quadratically
integrable in that interval and so defined as to make $G(a) = G(b)$.





in the interval and conform to homogeneous boundary conditions at $a$
and $b$.

Corollary: A Sturm-Liouville operator $A$ is Hermitian with reference
to an interval $a < x < b$ bounded by singular points, but having no such
points in its interior, and to the class of functions which are twice
differentiable in the interval and conform to the singular-point boundary
condition at $a$ and at $b$.

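Theorem: The basic Hamiltonian operator of the many-particle problem,¹
viz.,

$$H = -\frac{h^2}{8\pi^2}\sum_{k=1}^{n}\frac{1}{\mu_k}\nabla_k^2 + \sum_{j=1}^{n-1}\sum_{k=j+1}^{n}\frac{e_j e_k}{r_{jk}},$$ (32-6)

is Hermitian with reference to the domain consisting of all coordinate space
and to the functions of class D.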


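To prove the theorem we first introduce the $3n$-dimensional vector $\mathbf{B}$ with
components $u_1, v_1, w_1, u_2, v_2, \dots, w_n$ along the axes $x_1, y_1, z_1, \dots, z_n$, where

$$u_k = \frac{h^2}{8\pi^2\mu_k}\left(\psi_1\frac{\partial\psi_2^*}{\partial x_k} - \psi_2^*\frac{\partial\psi_1}{\partial x_k}\right),$$

with corresponding expressions for $v_k$ and $w_k$. Then

$$(H\psi_1)\psi_2^* - \psi_1 H\psi_2^* = \sum_{k=1}^{n}\left(\frac{\partial u_k}{\partial x_k} + \frac{\partial v_k}{\partial y_k} + \frac{\partial w_k}{\partial z_k}\right).$$

Let us next form the integral

$$J_G = \int\cdots\int\left[(H\psi_1)\psi_2^* - \psi_1 H\psi_2^*\right]dx_1\,dy_1\cdots dz_n$$

over a finite region $G$ of coordinate space so chosen that it includes no singular points.
Employing the method of Gauss we transform $J_G$ into the surface integral of the
normal component of $\mathbf{B}$ along the outward normal to the $(3n-1)$-dimensional
hypersurface of $G$. Let $G$ be bounded by a set of planes perpendicular to the coordinate
axes and by hypercylinders

$$r_{kj}^2 \equiv (x_k - x_j)^2 + (y_k - y_j)^2 + (z_k - z_j)^2 = a^2$$

to shut out the singular domains for which $r_{kj} = 0$.

If the boundary planes are now moved out to infinity at the same time that the
radii of the hypercylinders approach zero, the integral $J_G$ approaches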

¹ It is not difficult to extend this theorem to the Hamiltonian of Eq. (7-8) with
suitable restrictions on the vector potential. The essential features of the required
integral transformation for the three-dimensional case are indicated on p. 31.





$$(H\psi_1, \psi_2) - (\psi_1, H\psi_2)$$

as a limit. If we can show that under these circumstances the surface integral
approaches zero, provided that $\psi_1$ and $\psi_2$ belong to class D, we shall thereby
prove our theorem.

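It suffices to prove that the integral of the absolute value of the normal
component of $\mathbf{B}$ over the whole of each of these surfaces approaches zero in the limit.
Consider first the contribution of the typical cylinder $r_{kj} = a$. The normal
component of $\mathbf{B}$ is

$$B_n = u_k\frac{\partial r_{kj}}{\partial x_k} + v_k\frac{\partial r_{kj}}{\partial y_k} + w_k\frac{\partial r_{kj}}{\partial z_k} + u_j\frac{\partial r_{kj}}{\partial x_j} + v_j\frac{\partial r_{kj}}{\partial y_j} + w_j\frac{\partial r_{kj}}{\partial z_j} = \frac{h^2}{8\pi^2}\,\frac{\mu_k+\mu_j}{\mu_k\mu_j}\left[\psi_1\frac{\partial\psi_2^*}{\partial r_{kj}} - \psi_2^*\frac{\partial\psi_1}{\partial r_{kj}}\right].$$ (32-8)

The right-hand member of (32-8) assumes that $\psi_1$ and $\psi_2$ are expressed in terms of the
coordinates $\xi$, $\eta$, $\zeta$, $r_{kj}$, $\theta_{kj}$, $\varphi_{kj}$ introduced in connection with the
boundary condition $D_4$ (p. 199).

The surface integral of $B_n$ over the hypercylinder $S_{kj}$ for which $r_{kj} = a$ is equal to
the difference of the integrals

$$I_{kj}(a) = \frac{h^2}{8\pi^2}\,\frac{\mu_k+\mu_j}{\mu_k\mu_j}\,a^2\int\cdots\int_{S_{kj}} \psi_1\frac{\partial\psi_2^*}{\partial r_{kj}}\,\sin\theta_{kj}\,d\theta_{kj}\,d\varphi_{kj}\,d\xi\,d\eta\,d\zeta\,d\tau',$$

$$J_{kj}(a) = \frac{h^2}{8\pi^2}\,\frac{\mu_k+\mu_j}{\mu_k\mu_j}\,a^2\int\cdots\int_{S_{kj}} \psi_2^*\frac{\partial\psi_1}{\partial r_{kj}}\,\sin\theta_{kj}\,d\theta_{kj}\,d\varphi_{kj}\,d\xi\,d\eta\,d\zeta\,d\tau'.$$

From the inequality of Schwarz, Eq. (22-20), it follows that

$$|I_{kj}|^2 \le \left[\frac{h^2}{8\pi^2}\,\frac{\mu_k+\mu_j}{\mu_k\mu_j}\right]^2 a^4 \int\cdots\int_{S_{kj}}|\psi_1|^2\,d\sigma \int\cdots\int_{S_{kj}}\left|\frac{\partial\psi_2}{\partial r_{kj}}\right|^2 d\sigma, \qquad d\sigma = \sin\theta_{kj}\,d\theta_{kj}\,d\varphi_{kj}\,d\xi\,d\eta\,d\zeta\,d\tau'.$$

In view of the condition $D_4$ we conclude from the above inequality that
$\lim_{a \to 0} I_{kj} = 0$ if $\psi_1$ and $\psi_2$ belong to D. As the same argument applies to
$J_{kj}$, the contribution of $S_{kj}$, and hence of each of the hypercylinders, to the surface
integral approaches zero as its radius approaches zero.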

Consider next the integral of the absolute value of the normal component of $\mathbf{B}$
over the surface of the box made up of the plane boundaries of $G$. The integral is the
product of the mean value of $B_n$ into the area. The latter is proportional to the
$(3n-1)$th power of the linear dimension of the box, and it follows from $D_2$ that if we
expand the box in all directions at the same rate the value of $B_n$ will approach zero
more rapidly than any finite inverse power of the linear dimension. Hence the
integral over the plane boundary surfaces of $G$ also approaches zero as $G$ is expanded
to include all coordinate space. It follows that $H$ is Hermitian with respect to
coordinate space and the functions of class D. It is, in fact, Hermitian with respect to a
much wider class than D, for we have not used all the class D conditions in this proof.

In Sec. 15 we separated the variables of the Schrödinger equation by
introducing relative coordinates in place of the “absolute” coordinates of
Eqs. (7-2) and (7-3). The transformation there employed resolves the
complete Hamiltonian operator $H$ of Eq. (32-6) into the sum of two
operators, $H_r$ and a center-of-mass operator, giving the energy of the
relative motion (internal energy) and the energy of the motion of the
center of mass, respectively. The operator




is obviously Hermitian with respect to class D functions in absolute
coordinate space and also to class D functions in the three-dimensional
space $X$, $Y$, $Z$. It follows at once that the internal-energy operator $H_r$
of Eq. (32-1) is Hermitian with respect to class D functions in absolute
coordinate space. It is also Hermitian with respect to class D functions
in the space of the relative coordinates since we have only to multiply
each function of this type by a class D function of $X$, $Y$, $Z$ in order to
obtain a class D function in absolute coordinate space.

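Theorem: The scalar product $(\psi_1,\psi_2)$ of any two class D solutions of the
second Schrödinger equation

$$H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$$ (32-10)

is constant in time. The theorem is a direct consequence of the Hermitian
character of $H$. Thus

$$\frac{d}{dt}(\psi_1,\psi_2) = \left(\frac{\partial\psi_1}{\partial t},\psi_2\right) + \left(\psi_1,\frac{\partial\psi_2}{\partial t}\right) = -\frac{2\pi i}{h}\left[(H\psi_1,\psi_2) - (\psi_1,H\psi_2)\right] = 0.$$ (32-11)

Corollary: The normalization integral $(\psi,\psi)$ for any class D solution
of Eq. (32-10) is constant in time, as required by Sec. 8.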

Theorem: The eigenvalues of Eq. (32-1) for class D eigenfunctions are
all real (cf. Sec. 23). As in the one-dimensional case we have

$$(\psi, H_r\psi) - (H_r\psi, \psi) = (E^* - E)(\psi,\psi) = 0.$$

Since $(\psi,\psi)$ cannot vanish unless $\psi$ is identically zero, it follows that
$E^* = E$.

Theorem: Any two class D eigenfunctions of Eq. (32-1), say $\psi_1$, $\psi_2$,
having different energies $E_1$, $E_2$ must be mutually orthogonal. Thus

$$(\psi_1, H_r\psi_2) - (H_r\psi_1, \psi_2) = (E_2 - E_1)(\psi_1,\psi_2) = 0.$$

32e. Reduction of the Eigenvalue-eigenfunction Problem for Discrete
Spectra to Variational Form. Consider the solutions of the variational
equation

$$\delta([H_r - E]\psi, \psi) = 0,$$ (32-12)

with comparison functions subject to the boundary-continuity conditions
D. We have

$$\delta([H_r - E]\psi,\psi) = ([H_r - E]\,\delta\psi, \psi) + ([H_r - E]\psi, \delta\psi).$$

Since all comparison functions are of class D, $\psi$ and $\delta\psi$ must be of class D.


But $H_r$ is Hermitian with respect to class D. Hence we obtain the
reduction

$$\delta([H_r - E]\psi,\psi) = \text{real part of } 2\int \delta\psi^*\,[H_r - E]\psi\,d\tau = 0.$$ (32-13)

In view of the theorem of Sec. 32c the functions $\delta\psi$ are sufficiently
arbitrary so that (32-13) can hold only when $\psi$ is a class D solution of

$$H_r\psi = E\psi.$$

The extremals of the variational problem are the eigenfunctions of the
differential equation and vice versa.

The above formulation of the variational problem parallels the
formulation A of Sec. 24. It is easy to give other formulations parallel
to the schemes B and C of Sec. 24. Thus the eigenvalues and eigenfunctions
of Eq. (32-1) are the stationary values of $Q/N$ and the extremals
which yield those values, respectively, if $Q$ and $N$ are defined by

$$Q \equiv (H_r\psi, \psi), \qquad N \equiv (\psi,\psi).$$ (32-14)
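
A minimal sketch of the $Q/N$ formulation, under stated assumptions: a single electron in a hydrogen-like field with a fixed nucleus, atomic units ($h/2\pi$, the electron mass, and $e$ all equal to 1), and a hypothetical one-parameter family of trial functions $\psi = e^{-\alpha r}$. The kinetic part of $Q$ is evaluated from the squared radial derivative, and the radial integrals are done with a simple midpoint rule. None of the names below come from the text.

```python
import math

def rayleigh_quotient(alpha, r_max=40.0, steps=4000):
    """Q/N for the trial function psi = exp(-alpha*r), hydrogen-like field, atomic units.

    The common angular factor 4*pi cancels between Q and N, so only radial
    integrals are needed. Midpoint rule on 0 < r < r_max (an assumed cutoff).
    """
    dr = r_max / steps
    kinetic = potential = norm = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        psi = math.exp(-alpha * r)
        dpsi = -alpha * psi                      # radial derivative of psi
        kinetic += 0.5 * dpsi * dpsi * r * r * dr
        potential += -psi * psi * r * dr         # (-1/r) * psi^2 * r^2
        norm += psi * psi * r * r * dr
    return (kinetic + potential) / norm

# Scan the illustrative parameter alpha from 0.50 to 1.50 and pick the smallest quotient.
alphas = [0.5 + 0.01 * k for k in range(101)]
energies = [rayleigh_quotient(a) for a in alphas]
best_energy = min(energies)
best_alpha = alphas[energies.index(best_energy)]
print(best_alpha, best_energy)
```

Within this family the quotient is smallest at $\alpha = 1$, with a value close to $-1/2$, the ground-state energy of hydrogen in these units. For this particular family the exact eigenfunction happens to belong to the set of comparison functions, which is why the stationary value coincides with an eigenvalue.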

The integral Q when applied to class D functions is readily reduced to the 
alternative form 



32f. A Lower Bound for the Energy Integral. — The contribution of 
the kinetic-energy terms in the integrand, i.e., terms in the derivatives
of $\psi$, is essentially positive as is the contribution of the mutual potential
energy of pairs of particles carrying charges of like sign. Hence it is
possible to show that in the case of an atom or ion with a single positively
charged nucleus $Q$ has a lower bound. We identify the nucleus with the
particle $f + 1$ and introduce the integral $\bar{Q}$ defined by

- /. Ir-Ssi""** '^1’ + 2' 

. ifc -1 *-1 




dr. (32-16) 


$\bar{Q}$ is obtained from $Q$ by dropping positive terms in the integrand of
(32-15). Hence

$$Q[\psi] \ge \bar{Q}[\psi]$$



for all functions $\psi$. It follows that $Q[\psi]/N[\psi]$ has a lower bound if
such a bound exists for $\bar{Q}[\psi]/N[\psi]$. But $\bar{Q}[\psi]$ is the energy integral for a
modified Schrödinger problem involving $f$ independent particles moving
around a fixed center of force. The variables in this modified problem
separate in appropriate coordinates, yielding eigenfunctions
which are products of hydrogenic wave functions. An arbitrary class D
function of the coordinates $x_1, \dots, z_f$ can be expanded in terms of these
eigenfunctions. If the expansion is given the form

$$\psi = \sum_k c_k\psi_k + \sum_n\int c_n(E)\,\psi_n(E)\,dE,$$

it follows from the completeness of the system of hydrogenic functions
that

$$\frac{Q[\psi]}{N[\psi]} \ge \frac{\bar{Q}[\psi]}{N[\psi]} = \frac{\sum_k E_k|c_k|^2 + \sum_n\int E\,|c_n(E)|^2\,dE}{\sum_k |c_k|^2 + \sum_n\int |c_n(E)|^2\,dE}.$$

Thus the minimum eigenvalue of the modified problem is also a lower
bound for $Q/N$.
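
For orientation, the bound can be evaluated for helium ($f = 2$, nuclear charge $2e$). The sketch assumes an infinitely heavy nucleus, so that the modified problem reduces to two independent hydrogenic electrons, and writes $a_0$ for the Bohr radius. The experimental figure is the commonly quoted helium ground-state energy and is not a value taken from the text.

```latex
\bar{E}_{\min} = 2\left(-\frac{Z^{2}e^{2}}{2a_{0}}\right)\Big|_{Z=2}
             = -\frac{4e^{2}}{a_{0}} \approx -108.8\ \text{eV},
\qquad
E_{\text{exp}} \approx -79.0\ \text{eV} \;>\; \bar{E}_{\min}.
```

The gap of roughly 30 eV between the two figures is the positive interelectronic repulsion that was dropped in forming the modified problem.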

The corresponding theorem for the general case of a system containing 
any number of positively and negatively charged particles does not go 
through so easily, but we shall assume that it is true. 

The conclusion that $Q/N$ has a lower bound makes it reasonable to
suppose that there is a function $\psi_0(x_1, \dots, z_f)$ which actually minimizes
$Q/N$ but does not prove the existence of such a function. The existence
of a lower bound for $Q/N$ is an obvious necessary condition for the
existence of the minimizing function but is not sufficient. This is proved by
the consideration of the problem of the motion of a free particle in one
dimension. If the potential energy is set equal to zero, the energy
integral is essentially positive, but, although it is bounded below, it has no
discrete eigenvalues and no minimizing function.
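
The free-particle case can be made concrete with a family of Gaussian trial functions. The family $\psi_a$ and the width parameter $a$ are illustrative assumptions; $\mu$ is the mass of the particle.

```latex
\psi_a(x) = e^{-x^{2}/2a^{2}}, \qquad
\frac{Q[\psi_a]}{N[\psi_a]}
  = \frac{h^{2}}{8\pi^{2}\mu}\,
    \frac{\int_{-\infty}^{\infty} |d\psi_a/dx|^{2}\,dx}
         {\int_{-\infty}^{\infty} |\psi_a|^{2}\,dx}
  = \frac{h^{2}}{16\pi^{2}\mu a^{2}} \;\longrightarrow\; 0
  \quad (a \to \infty).
```

The quotient approaches its lower bound zero as the packet spreads, but no quadratically integrable function attains it, since a vanishing quotient would require $d\psi/dx = 0$ everywhere.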

*32g. Behavior of Solutions of the Differential Equation at Singular 
Domains. — Another necessary condition for the existence of a minimizing 
function $\psi_0$ is that the differential equation shall have solutions which
conform to the class D boundary condition in the neighborhood of each of its 
singular domains. The absence of a minimizing function in the above 
mentioned one-dimensional case may be ascribed to the failure of this 
second necessary condition. 

We proceed to a preliminary examination of solutions of the differential
equation in the neighborhood of the singular domains for the special
case of the helium atom. The system consists of two electrons and a
nucleus, which we designate as particles 1, 2, 3, respectively. The
potential energy is

$$V = -\frac{2e^2}{r_{13}} - \frac{2e^2}{r_{23}} + \frac{e^2}{r_{12}}.$$

The singular domains to be investigated are (1) the domains $r_{13} = 0$,
$r_{23} = 0$, $r_{12} = 0$, representing two-particle collisions; (2) the point
$r_{13} = r_{23} = r_{12} = 0$, representing a three-particle collision; and (3) the
domain at infinity. Different coordinate systems are useful for studying
the different singular domains. First, using the scheme mentioned in
footnote 1, p. 64 we write the differential equation in the form


with 


_?L 4- ^ 


rP + k(E - V)i = 


0 , 


(32-17) 


^1 = xi — xs; 

vi = yi — Va] 

fi = - za; 

_ MliMi + Ml) . 
+ M2 + Ms’ 


- ^i(‘ 


2/iMi + yysMs V 
Ml + M3 / 


f, - J-(>, - 

\M\ Ml + M3 / 


M 


_MiM2_ . 
Ml + M3^ 


Stt^M 

' /l2“ ’ 


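If we make the approximation of treating the nuclear mass $\mu_3$ as infinite in
comparison with the electronic mass, $\xi_2$, $\eta_2$, $\zeta_2$ reduce to the $x$, $y$, $z$
components of $\mathbf{r}_{23}$, respectively. Introducing spherical coordinates $\rho_1$, $\theta_1$,
$\varphi_1$ in place of $\xi_1$, $\eta_1$, $\zeta_1$, we reduce the equation to the form

$$\frac{1}{\rho_1^2}\frac{\partial}{\partial\rho_1}\left(\rho_1^2\frac{\partial\psi}{\partial\rho_1}\right) + \frac{\Lambda_1\psi}{\rho_1^2} + \nabla_2^2\psi + \kappa(E - V)\psi = 0,$$

where $\Lambda_1$ and $\nabla_2^2$ are the operators

$$\Lambda_1 = \frac{1}{\sin\theta_1}\frac{\partial}{\partial\theta_1}\left(\sin\theta_1\frac{\partial}{\partial\theta_1}\right) + \frac{1}{\sin^2\theta_1}\frac{\partial^2}{\partial\varphi_1^2}, \qquad \nabla_2^2 = \frac{\partial^2}{\partial\xi_2^2} + \frac{\partial^2}{\partial\eta_2^2} + \frac{\partial^2}{\partial\zeta_2^2}.$$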


If we now enter the differential equation with the expansion

$$\psi = \sum_{n=0}^{\infty}\rho_1^{\,n} F_n(\theta_1,\varphi_1,\xi_2,\eta_2,\zeta_2),$$

we obtain a sequence of partial differential equations for the coefficients
$F_n(\theta_1,\varphi_1,\xi_2,\eta_2,\zeta_2)$. There is no apparent difficulty about solving these





equations so long as we restrict the domain of the expansion to the region
in which $r_{13}$, i.e., $\rho_1$, is less than [...]. Thus we conclude that (32-17) has
solutions which are finite on the singular domains (1).

If the interelectronic energy term $e^2/r_{12}$ is neglected, the variables can
be separated in Eq. (32-17), and solutions are obtainable in the form of the
product of two three-dimensional hydrogenic-atom eigenfunctions.
These product eigenfunctions for the modified problem can be written
as power series in the six-dimensional radius

$$R = \sqrt{\xi_1^2 + \eta_1^2 + \zeta_1^2 + \xi_2^2 + \eta_2^2 + \zeta_2^2}.$$

Hence we are led to investigate the behavior of solutions of the original
problem near the singular point (2) by seeking solutions which are power
series in $R$. Introducing six-dimensional spherical coordinates, defined
by

$$\begin{aligned}
\xi_1 &= R\cos\alpha\,\sin\theta_1\cos\varphi_1, & \xi_2 &= R\sin\alpha\,\sin\theta_2\cos\varphi_2,\\
\eta_1 &= R\cos\alpha\,\sin\theta_1\sin\varphi_1, & \eta_2 &= R\sin\alpha\,\sin\theta_2\sin\varphi_2,\\
\zeta_1 &= R\cos\alpha\,\cos\theta_1, & \zeta_2 &= R\sin\alpha\,\cos\theta_2,
\end{aligned}$$

we convert (32-17) into the form

/iiHl) + - "■ 

where $D$ is a Hermitian linear operator involving the angles $\alpha$, $\theta_1$, $\theta_2$, $\varphi_1$, $\varphi_2$
and independent of $R$. Its explicit form is given in Eq. (35-14). The
potential function $V$ is of the form $U/R$, where $U$ depends only on
$\alpha, \theta_1, \dots, \varphi_2$.

If a solution exists of the form 

f (32-19) 

n «-0 

it is necessary that the functions $G_n$ shall belong to the Hermitian manifold
of D and that in consequence P shall have the value zero. Unfortunately
this substitution yields a sequence of equations for the successive $G_n$'s
which have not been solved and in all probability have no solutions in the
desired Hermitian manifold. Thus one can say with considerable confidence
that if eigenfunctions of (32·17) exist, the singular point at the
origin lies outside the scope of (32·19). We have no answer, however,
to the question whether or not there are solutions of (32·17) having a
more complicated behavior, but still falling within the Hermitian manifold
of the Hamiltonian.¹

The disappointment of the above conclusion is somewhat relieved
when we reflect that the Coulomb potential function used in our Hamiltonian
is not known a priori to be absolutely correct. It would be
possible to invent a substitute potential function analytic over all
coordinate space and yet so nearly equal to the Coulomb function,
except in the immediate neighborhood of the domains (1) and (2),
as to be experimentally indistinguishable from the latter. Eigenfunctions
of a Schrödinger equation using such a modified potential function
certainly do exist. Therefore an ultimate negative conclusion regarding
our existence theorem would mean a slight modification of the Coulomb
law rather than a fundamental change in the theory of wave mechanics.
For the present it is convenient to assume what we cannot prove, viz.,
the existence of solutions of (32·17) which do conform to the boundary
condition D4 and are consequently adapted to the construction of
eigenfunctions.

¹ The problem is under investigation by Prof. J. H. Bartlett, to whom the author is
indebted for valuable suggestions. See two articles on the helium wave equation by
T. H. Gronwall and J. H. Bartlett, Phys. Rev. 51, 655, 661 (1937).

A similar attack can be made on the singular points $r_1 = 0$, $r_2 = 0$,
$r_{12} = 0$, using coordinates appropriate to power-series expansions in $r_1$,
$r_2$, or $r_{12}$ as the case may be, and with similar semisatisfactory results.

Consider next the behavior of solutions of the differential equation
(32·18) for very large values of R. Except in those singular directions
along which one of the three terms in the potential energy is infinite, we
can neglect $V\psi$. The variables are then separable and the hypothesis
that $\psi = u(R)\,v(\alpha, \ldots, \varphi_2)$ gives for every negative energy an infinity of
approximate solutions of (32·17) which approach zero at infinity as
$e^{-\sqrt{-\kappa E}\,R}$ and which therefore conform to the D conditions at infinity.
Solutions of the form $\psi = uv$ for positive values of E are not quadratically
integrable. We infer that the helium atom can have no discrete positive-energy
eigenvalues, but that, so far as this portion of the boundary is
concerned, any negative energy gives solutions of the differential equation
which conform to the class D boundary conditions.

Thus we find essentially the same difference in the behavior of solutions
near infinity for positive and negative energies as in the three-dimensional
hydrogenic atom. One might readily jump at the conclusion
that here again there is a discrete spectrum of negative-energy eigenvalues
which meets a continuous spectrum of positive energies at E = 0. This
conclusion must be wrong, however, for experimentally the continuous
spectrum extends down to the energy of the normal state of the ion,
i.e., to $-4Rh$. An examination of the behavior of solutions for large
values of $r_2$ and $r_{12}$ throws some light on the problem.

In this domain we can neglect the potential-energy terms in $1/r_2$ and
$1/r_{12}$, treating electron 2 as a free particle. The Schrödinger equation
in the coordinates of (32·17) reduces to

$$\left(\nabla_1^2 + \nabla_2^2 + \kappa E + \kappa\frac{2e^2}{r_1}\right)\psi = 0.$$ (32·20)

It has solutions of the form $\psi = \psi_1\psi_2$, where $\psi_1$ and $\psi_2$
are solutions of the equations

$$\nabla_1^2\psi_1 + \kappa\left(E_1 + \frac{2e^2}{r_1}\right)\psi_1 = 0,$$ (32·21)

$$\nabla_2^2\psi_2 + \kappa(E - E_1)\psi_2 = 0.$$ (32·22)

The first of this pair of equations is that of the He+ ion, while the second
is of the free-particle type.

In that portion of configuration space in which $e^2/r_2$ and $e^2/r_{12}$
are much smaller than E, any solution of the initial Eq. (32·17), say
$\psi(\xi_1, \ldots, \zeta_2, E)$, must reduce to a linear combination of products of solutions
$\psi_1$, $\psi_2$ of the above equations. If $\psi$ is of class D, or merely quadratically
integrable, each term of this linear combination must be quadratically
integrable over that portion of configuration space in which the neglected
terms of the potential energy are actually small. But this means that
$\psi_1$ is quadratically integrable over all $\xi_1$, $\eta_1$, $\zeta_1$ space. In fact, if $\psi$ is of
class D, $\psi_1$ must be a class D eigenfunction of (32·21) and $E_1$ a discrete
(negative) energy level of the He+ ion. $\psi_2$ in turn must be quadratically
integrable over all that portion of $\xi_2$, $\eta_2$, $\zeta_2$ space in which $r_2$ is large,
a condition which can be met by giving $E - E_1$ any negative value, as
may be seen by introducing spherical coordinates. Thus it seems possible
to find functions $\psi(\xi_1, \ldots, \zeta_2, E)$ for every negative value of E which,
in the region where $e^2/r_2$ and $e^2/r_{12}$ are small, conform to all boundary
conditions appropriate to a discrete eigenfunction except those which
apply at $r_{12} = 0$ and $r_2 = 0$.

Consider next, however, the possibility of a type B eigenfunction
(i.e., one which satisfies the D conditions at finite points but is not
quadratically integrable) of (32·17) with a negative energy E. If such an
eigenfunction exists, it must also be expressible as a linear combination
of products $\psi_1\psi_2$ in each of which $\psi_1$ must be either a class D or class B
eigenfunction of (32·21) and $\psi_2$ must be bounded in the neighborhood of
infinity. In the former case $E_1$ is a helium-ion energy level as before,
but $E_2$ must be positive. In the latter case $E_1$ is positive, and if we
make the approximation of treating the nuclear mass as infinite in comparison
with the electronic mass, we can show that $E_2$ must be a He+
energy level.¹ There is in either case the possibility of class B
eigenfunctions for energies as low as that of the normal state of He+,
which we designate as $E_n^{+}$. We are thus confronted with an apparent
overlapping of the discrete and continuous spectra in the energy interval
E < 0.

¹ Neglecting $e^2/r_{12}$, but keeping both of the other terms in the potential energy,
one obtains a He+ equation for each of the two electrons. The approximations are
valid over enough of coordinate space to prove that if the energy of either electron is
negative it must be a discrete energy level.




*32h. The Auger Effect. Such an overlapping can take place for a
properly chosen Hamiltonian and does certainly occur in the case of a
pair of electrons moving around a common center of force but having
no interactions with each other. Moreover, we know by experiment
that, in general, atoms and atomic ions do have relatively sharp emission
and absorption lines associated with energy levels which lie above the
lower limit of the continuous energy-level spectrum, i.e., above the lowest
ionization potential. These overlapping energy levels are apparently
of the weakly quantized variety, however, atoms in such states passing
spontaneously without emission or absorption of radiation into ionized
(continuous-spectrum) states of the same energy. This process, known
as the Auger effect,¹ is entirely analogous to the spontaneous disintegration
of atomic nuclei by the emission of alpha particles (cf. Sec. 31). It
was discovered in the study of Wilson cloud chamber photographs of
gases irradiated by X-rays. In addition to the long tracks of photoelectrons,
shorter tracks were found which could be accounted for only
as the result of spontaneous transitions from excited singly ionized states
(due to the removal of inner electrons) to states of equal energy consisting
of a doubly ionized atom and an ejected electron. More recently it has
been shown² that the great breadth of many X-ray lines is probably due
in large part to the shortening of the lifetime of the associated energy
levels by the Auger effect. There is also definite evidence of similar
spontaneous ionization from the upper energy levels of the optical spectra
of certain atoms. In the light of Sec. 31 we must interpret these results
as proof that, strictly speaking, the discrete and continuous spectra of
atoms do not ordinarily overlap, the energy levels from which spontaneous
ionization takes place being associated with states which are imperfectly
quantized. In other words, we associate the Auger effect with quadratically
integrable functions which for certain fairly well defined energies
yield approximate solutions of the first Schrödinger equation and hence
give rise to quasi-stationary states of finite, but relatively long, lifetime,
capable of emitting and absorbing radiation like true discrete energy-level
states. It may happen, of course, that the effect of the emission and
absorption of radiation on the lifetime of an imperfectly quantized state,
which is neglected in setting up the basic equation (32·1), is greater than
the effect of spontaneous ionization. In that case a distinction between
exactly quantized and imperfectly quantized states based on (32·1)
becomes somewhat academic.
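The overlapping that "does certainly occur" for two non-interacting electrons can be made concrete. The sketch below is an illustration under stated assumptions, not a calculation from the text: the nuclear mass is taken as infinite, the interelectronic repulsion is dropped, and energies are measured in units of Rh, so that each electron has the hydrogenic levels $-Z^2/n^2$. The function names are hypothetical.

```python
def noninteracting_level(n1, n2, Z=2):
    """Energy (in units of Rh) of two independent electrons in orbits n1 and n2."""
    return -float(Z * Z) * (1.0 / n1 ** 2 + 1.0 / n2 ** 2)


def continuum_threshold(Z=2):
    """Lowest energy with one electron free: the other sits in the ion's normal state."""
    return -float(Z * Z)


def levels_above_threshold(n_max, Z=2):
    """Discrete two-electron levels (n1 <= n2 <= n_max) lying inside the continuum."""
    threshold = continuum_threshold(Z)
    found = []
    for n1 in range(1, n_max + 1):
        for n2 in range(n1, n_max + 1):
            energy = noninteracting_level(n1, n2, Z)
            if energy > threshold:
                found.append((n1, n2, energy))
    return found
```

With Z = 2 every level having both electrons excited (n1, n2 ≥ 2) lies above the threshold at −4Rh, while every level with one electron in the normal orbit lies below it. In this model these doubly excited levels are the ones that overlap the continuous spectrum.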

Returning from these general considerations to the helium-atom
problem, we note that our examination of the domain in which $r_2$ and
$r_{12}$ are large is insufficient to prove the existence of any discrete energy

¹ P. Auger, Ann. de Physique 6, 183 (1926).

² E. Ramberg and F. K. Richtmyer, Phys. Rev. 47, 644, 806 (1935).




levels, to say nothing of giving a definite proof of the existence of exact
class D eigenfunctions, for energies above the lowest ionization potential.
It does show, however, that there are two possible ways in which a
bounded solution of the wave equation with an energy in the interval
$E_n^{+} < E < 0$ can behave in the region where $r_2$ and $r_{12}$ are large, leaving
it an open question whether the boundary conditions for $r_2$ and $r_{12}$ small
can be met independently by wave functions having these two types of
behavior, or whether wave functions which satisfy the above-mentioned
"inner" boundary conditions must always reduce at infinity to linear
combinations of the two types. Theoretical considerations can be
urged against the former alternative but are not sufficiently conclusive
to warrant reproduction here. Granting the validity of the second
alternative on the basis of experiment, we may properly reinterpret the
original argument for an overlapping of the discrete and continuous
spectra as theoretical evidence of the existence of weakly quantized states
in the energy region $E_n^{+} < E < 0$.

32i. The Discrete Eigenfunctions of the Differential Equation as
Minimizing Functions. In Sec. 32e we saw that the general atomic
eigenvalue-eigenfunction problem can be reduced to variational form;
in Sec. 32f we proved that the quantity $Q[\psi]/N[\psi]$ to be varied has a
lower bound; and in Secs. 32g and 32h we satisfied ourselves, for the
special case of the helium problem, that solutions of the Schrödinger
equation which conform to the class D boundary conditions exist in
the neighborhood of each singular domain provided only that the
energy is less than the energy of the normal state of the ionized atom,
$E_n^{+}$. If it can be proved that $E_n^{+}$ is greater than $E_0^{(0)}$, we shall know
that all the necessary conditions for the existence of a function of class D,
which actually minimizes Q/N, are satisfied. Actually $E_0^{(0)}$ and $E_n^{+}$
are readily worked out in this case, each being derivable from the solution
of a hydrogenic-atom problem. We find that $E_n^{+} = -4Rh$, and
$E_0^{(0)} = -8Rh < E_n^{+}$, where R is the Rydberg constant.
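The two numbers quoted here follow from the hydrogenic level formula. The sketch below assumes infinite nuclear mass and energies in units of Rh. It further assumes, as an interpretation rather than a statement of the text, that the value −8Rh arises as the energy of two independent electrons each in the normal orbit of He+.

```python
def hydrogenic_level(Z, n):
    """Level n of a one-electron ion of nuclear charge Z, in units of Rh."""
    return -float(Z * Z) / float(n * n)


# Normal state of the He+ ion.
E_ION_NORMAL = hydrogenic_level(2, 1)

# Assumed origin of the lower bound: both electrons in the He+ normal orbit,
# with the interelectronic repulsion omitted.
E_LOWER_BOUND = 2.0 * hydrogenic_level(2, 1)
```

The required inequality is then simply −8 < −4 in units of Rh.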

At this point we cross the Rubicon by assuming, not only for helium,
but for every stable atomic or molecular system of particles, that there
exists a class D function which minimizes Q/N and gives it a value $E_0$
which we identify with the energy of the normal state of the system. We
further assume that in general there exists a sequence of class D functions
$\psi_0, \psi_1, \ldots, \psi_n, \ldots$ having the property that each minimizes Q/N subject
to the class D boundary-continuity conditions and to the additional
restriction that all admissible comparison functions shall be orthogonal
to each earlier function of the sequence. Since each set of comparison
functions is more restricted than the set used in the preceding problem,
the eigenvalues $E_0, E_1, \ldots, E_n, \ldots$ conform to the inequality
$E_n \le E_{n+1}$. If two or more of the eigenvalues are equal, we have a
degenerate energy level. (The numbering of the eigenvalues, it will
be observed, corresponds to that of an ordered sequence of minimizing
functions and not to the number of distinct energy values.) By Sec. 32e
the eigenvalues and eigenfunctions obtained in this way are the eigenvalues
and eigenfunctions of the appropriate Schrödinger equation,
(32·1). From a theorem on p. 206 we know that eigenfunctions belonging
to different discrete eigenvalues must always be orthogonal. However,
when they are derived in this way, independent eigenfunctions having the
same eigenvalue are also orthogonal. On the other hand, since any
arbitrary linear combination of a set of degenerate eigenfunctions of the
Schrödinger equation is an eigenfunction of the same eigenvalue, the
orthogonality property is not a necessary characteristic of an arbitrary
set of degenerate eigenfunctions.

32j. The Continuous Spectrum and the Completeness of the System
of Eigenfunctions. If the series of discrete eigenvalues were infinite in
every case, and if we could assume $\lim_{n\to\infty} E_n = \infty$, the argument of Sec. 25
could be used to prove the completeness of the sequence $\psi_0, \psi_1, \ldots, \psi_n, \ldots$
and our right to expand an arbitrary class D function into an
infinite series of its members. However, as the discrete eigenvalues are
negative and, in fact, have been assumed to lie below $E_n^{+}$, the class D
eigenfunctions do not form a complete set. In order to set up an expansion
theorem it is necessary to make use of solutions of the Schrödinger
equation which conform to the class D conditions at finite points but
which are not quadratically integrable in the neighborhood of infinity.
We assume the existence of such type B eigenfunctions for all energies
above the energy of the normal state of the once-ionized system. We
infer from Sec. 32g that in portions of coordinate space, where one
particle has a negligible mutual potential energy with the rest of the
system, each of these type B functions can be factored into the product
of a positive-energy type B eigenfunction of the appropriate free-particle
problem and an eigenfunction of the Schrödinger equation for a system
composed of the remaining particles. Thus the type B eigenfunctions
can be said to describe dissociated or ionized states of the system, although
in special cases an integral combination of such functions can describe a
weakly quantized state in which the system is not dissociated or ionized.

In order to justify the assumption of an expansion theorem bringing
in the continuous spectrum we again resort to a modified problem which
has no continuous spectrum but of which the actual problem can be
regarded as the limit. Let $F_0$ denote the original problem of Eq. (32·1)
and let $F_1(\sigma)$ denote a continuous one-parameter set of modified problems
differing from $F_0$ only in that the fundamental region of coordinate
space over which its solutions are spread is bounded externally by the
surfaces

$$r_1 = \sigma, \quad r_2 = \sigma, \quad r_3 = \sigma, \ \ldots,$$ (32·23)

on which $\psi$ is required to vanish. This problem has a system of discrete
orthogonal eigenfunctions which are readily proved complete.¹

If the radius $\sigma$ is now allowed to become infinite, the problem $F_1(\sigma)$
approaches $F_0$ as a limit. Hence we can infer the properties of the latter
spectrum from those of the former as in Sec. 30. The essential assumptions
here are those which relate to orthogonality and completeness.
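The passage from a bounded region to the unbounded one can be illustrated on a far simpler system than the many-particle problem. The sketch below uses a free particle in a one-dimensional box of width σ as a stand-in (an assumption made only for illustration), with energies in units of h²/8μ so that the levels are (n/σ)². The function names are hypothetical.

```python
def box_level(n, sigma):
    """Level n = 1, 2, ... of a one-dimensional box of width sigma, in units of h^2/(8 mu)."""
    return (n / sigma) ** 2


def count_levels_below(energy, sigma):
    """Number of box levels lying strictly below the given energy."""
    n = 1
    while box_level(n, sigma) < energy:
        n += 1
    return n - 1
```

For fixed σ the levels are discrete and increase without limit, so only a finite number lie below any given energy. As σ grows, the number below a fixed energy grows in proportion, which is how the discrete levels of the bounded problem crowd together into a continuous spectrum in the limit.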

Both the discrete and continuous-spectrum eigenfunctions of $F_0$
will be degenerate as in the simple special case of the hydrogen atom.
Introducing a subscript k to differentiate between different orthogonal
eigenfunctions of a given eigenvalue, we designate the discrete eigenfunctions
by $\psi_{kn}(x)$ and the type B functions by $\psi_k(E,x)$. We assume
the quadratic integrability of the eigendifferentials

$$\Delta_j\psi_k = \int_{E_j}^{E_j+\eta_j}\psi_k(E,x)\,dE$$ (32·24)

and adopt the normalization rules

$$(\psi_{kn},\psi_{kn}) = 1; \qquad \frac{1}{\eta_j}(\Delta_j\psi_k,\Delta_j\psi_k) = 1.$$ (32·25)

The orthogonality of the $\psi_{kn}$'s was proved in Sec. 32h. Comparison
with the $F_1$ problem suggests the additional orthogonality properties

$$\begin{aligned}
(\Delta_j\psi_k,\psi_{k'n}) &= 0,\\
(\Delta_j\psi_k,\Delta_i\psi_{k'}) &= 0, \quad \text{if } k \ne k',\\
(\Delta_j\psi_k,\Delta_i\psi_k) &= 0, \quad \text{if } E_j+\eta_j < E_i \text{ or } E_i+\eta_i < E_j.
\end{aligned}$$ (32·26)

Finally we postulate the validity of the completeness relation

$$(f,g) = \sum_k\sum_n c_{kn}b_{kn}^{*} + \sum_k\int c_k(E)\,b_k^{*}(E)\,dE$$ (32·27)

for any pair of quadratically integrable functions f and g with Fourier
coefficients $c_{kn}$, $c_k(E)$ and $b_{kn}$, $b_k(E)$, respectively, defined as in (30·43)
and (30·44).
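For a problem with a purely discrete spectrum the completeness relation reduces to a sum over the discrete coefficients, and this can be verified on a toy case. The sketch below is an illustration only: it assumes the one-dimensional box 0 ≤ x ≤ 1 with normalized eigenfunctions √2 sin(nπx), takes f = g = x(1 − x), and uses the closed-form coefficient obtained by two integrations by parts. None of these choices comes from the text.

```python
import math


def sine_coefficient(n):
    """c_n = (f, psi_n) for f(x) = x(1 - x) and psi_n(x) = sqrt(2) sin(n pi x) on [0, 1]."""
    return math.sqrt(2.0) * 2.0 * (1.0 - (-1.0) ** n) / (n * math.pi) ** 3


def completeness_sum(n_max):
    """Partial sum of |c_n|^2, to be compared with (f, f)."""
    return sum(sine_coefficient(n) ** 2 for n in range(1, n_max + 1))


# (f, f) = integral of [x(1 - x)]^2 over [0, 1] = 1/3 - 1/2 + 1/5.
NORM_SQUARED = 1.0 / 30.0
```

The partial sums approach 1/30 rapidly because the coefficients fall off as 1/n³. In the actual atomic problem the integral over the continuous spectrum supplies the part of (f, g) that the discrete sum alone cannot reach.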

¹ The essential feature of the proof is to show that if we denote by $\varphi_0, \varphi_1, \ldots$ the
successive normalized eigenfunctions of a sequence obtained by minimizing Q/N with
orthogonality conditions similar to those used in Sec. 32h and by $E_0(\sigma), E_1(\sigma), \ldots$
the corresponding eigenvalues,

$$\lim_{n\to\infty}E_n(\sigma) = \infty.$$

This can be done with the aid of a second modification of the problem, $F_2(\sigma)$, in which
the mutual repulsions of the electrons are omitted and the variables separated. Each
energy level of the $F_1(\sigma)$ problem lies above the corresponding level of the $F_2(\sigma)$
problem, and the spectrum of the latter problem can be worked out explicitly and thus
shown to extend to infinity. The maximum-minimum principle stated in Courant-Hilbert,
pp. 352-353, is useful in showing that the energy levels of $F_1$
lie above those of $F_2$ (cf. also Courant-Hilbert, M.M.P., Satz 7, Seite 367).





If the function f(x) conforms to the class D conditions, Hf will be
quadratically integrable, and in analogy with the Weyl theory we may
suppose that f(x) is not only root-mean-square expansible, as indicated
by the completeness relation, but actually given point by point by the
uniformly convergent series

$$f(x) = \sum_k\sum_n c_{kn}\psi_{kn}(x) + \sum_k\int c_k(E)\,\psi_k(E,x)\,dE.$$ (32·28)

We postulate that the eigendifferentials belong to the class of functions
with respect to which the Hamiltonian is Hermitian. The legitimacy
of a term-by-term application of the operator H to the series (32·28)
is a consequence of this assumption. To prove the property, we note
that as Hf belongs to class D we can expand it like f(x) itself. Let the
coefficients be $h_{kn}$ and $h_k(E)$. Then

$$h_{kn} = (Hf,\psi_{kn}) = E_n c_{kn};$$ (32·29)

$$h_k(E) = \lim_{\eta\to 0}\frac{1}{\eta}\left(Hf,\int_E^{E+\eta}\psi_k(E',x)\,dE'\right) = \lim_{\eta\to 0}\frac{1}{\eta}\left(f,\int_E^{E+\eta}E'\,\psi_k(E',x)\,dE'\right) = E\,c_k(E).$$ (32·30)

As these are exactly the coefficients we should obtain by applying H
term by term to the series (32·28), the proposition is proved. The
hypothesis on which it is based is supported by comparison of the $F_0$
and $F_1(\sigma)$ problems.
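The discrete half of this proposition, that the coefficients of Hf are the eigenvalues times the coefficients of f, can be checked in closed form on a toy problem. The sketch below is an illustration under assumptions not taken from the text: a one-dimensional box 0 ≤ x ≤ 1, units in which H = −d²/dx², eigenfunctions √2 sin(nπx) with eigenvalues (nπ)², and f(x) = x(1 − x), for which Hf = 2.

```python
import math


def eigenvalue(n):
    """Eigenvalue of H = -d^2/dx^2 for psi_n(x) = sqrt(2) sin(n pi x) on [0, 1]."""
    return (n * math.pi) ** 2


def coefficient_of_f(n):
    """c_n = (f, psi_n) for f(x) = x(1 - x)."""
    return math.sqrt(2.0) * 2.0 * (1.0 - (-1.0) ** n) / (n * math.pi) ** 3


def coefficient_of_hf(n):
    """h_n = (Hf, psi_n), computed directly from Hf = 2."""
    return math.sqrt(2.0) * 2.0 * (1.0 - (-1.0) ** n) / (n * math.pi)
```

Here h_n was obtained by integrating Hf against the eigenfunction, not by multiplying c_n by the eigenvalue, so the agreement of the two is a genuine check of the Hermitian property on which the argument rests.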

32k. Degeneracy. In the relation $\lim_{n\to\infty}E_n(\sigma) = \infty$ of footnote 1,
p. 216, the index n is the ordinal number of the corresponding member
of the complete sequence of minimizing functions $\varphi_0, \varphi_1, \ldots, \varphi_n, \ldots$.
It follows that the number of linearly independent eigenfunctions for all
energies below any given energy is finite. Hence each individual level
of the $F_1(\sigma)$ problem must have at most a finite number of linearly
independent eigenfunctions. In other words every level has a finite
multiplicity or degeneracy. The question now arises: Does the multiplicity
of a discrete energy level remain finite when we pass from the
problem $F_1(\sigma)$ to $F_0$ by allowing $\sigma$ to become infinite? To show that the
answer is affirmative, we make the contrary assumption that one or
more of the discrete eigenvalues of the $F_0$ problem have an infinite multiplicity.
Then, since every discrete eigenfunction of $F_0$ is the limit
of a corresponding eigenfunction of $F_1(\sigma)$, it is evident that all eigenvalues
$E_n(\sigma)$ for which n is greater than some finite value N must approach the
lowest infinitely degenerate level of $F_0$ as a limit when $\sigma$ becomes infinite.
It is further required that the complete spectrum of $F_1(\sigma)$ for large values
of $\sigma$ shall approach the complete spectrum of $F_0$ in such fashion that the
spacing of the levels of $F_1(\sigma)$ shall become very small in the neighborhood
of the eigenvalues of $F_0$ but very large in regions where $F_0$ has an empty
spectrum. If one considers the behavior of the spacing of a group of
eigenvalues of $F_1(\sigma)$ as a function of $\sigma$ in the light of these requirements,
it becomes evident that the hypotheses are incompatible with the continuity
of the functions $E_n(\sigma)$. Hence the discrete energy levels of the
many-particle problem have at most a finite degeneracy.



CHAPTER VII 


DYNAMICAL VARIABLES AND OPERATORS 

33. THE MEAN VALUES OF THE CARTESIAN COORDINATES AND
CONJUGATE LINEAR MOMENTA

33a. The Statistical Mean Values of the Coordinates. The entire
theory developed in this book is based on the interpretation of $|\Psi|^2$ as
probability density in the configuration space of the Cartesian positional
coordinates $x_1, x_2, \ldots, x_{3n}$. In order to give this interpretation
operational meaning (cf. footnote 1, p. 52) it is necessary to assume
that the configurations of atomic systems are in principle measurable
with any desired finite precision. This hypothesis is open to criticism
on two counts. In the first place it overlooks the impossibility of distinguishing
between different electrons, different protons, etc. In the
second place it overlooks the relativistic difficulty¹ in locating the position
of an electron with an uncertainty less than the Compton wave length
$h/\mu_0 c$. The first of these objections is remediable by a suitable modification
of the theory discussed in Sec. 42b. The second is one which has
not been remedied in an entirely satisfactory manner as yet. It is
apparently of slight importance so long as we restrict the application of
the theory to domains in which the local wave length is everywhere large
compared with the Compton wave length. In practice this means that
we can apply the theory with confidence to extranuclear problems
which do not involve energies for individual photons or electrons of more
than, say, 100,000 electron-volts.

Setting aside both of these objections for the present, we give precise
meaning to the phrase "probability density" in the following manner.
Let us suppose that exact measurements of configuration are carried
out on the individual members of an assemblage of N identical independent
atomic systems so prepared that, at the time of measurement,
all are in a common subjective state described by the normalized wave
function $\Psi(x,t_0)$. Then, if N is large enough so that the number
of systems dN found to have configurations in the element $d\tau$ of configuration
space is itself large, we postulate that dN/N is equal to $|\Psi|^2\,d\tau$
with an error which will nearly always be small compared with unity
and which can be neglected in practice.

Let q denote any function of the basic coordinates $x_1, \ldots, x_{3n}$.
As the value of q is fully determined by a configuration measurement, we
can work out the probability of any range dq of q values, say $q' < q < q' + dq$,
by integrating $|\Psi|^2$ over that portion of configuration space for which q
is in the range dq. Thus our hypothesis regarding $|\Psi|^2$ implicitly fixes
the distribution function for measured values of q made on an assemblage
of systems in the state $\Psi$. Moreover, it fixes the mean value $\bar{q}$ of any
such function $q(x_1, \ldots, x_{3n})$ for a sufficiently large number of measurements
as

$$\bar{q} = \int q\,|\Psi|^2\,d\tau.$$ (33·1)

¹ Cf. L. Landau and R. Peierls, Zeits. f. Physik 69, 56 (1931).

This type of statistical mean value is often referred to as the expectation 
value of the quantity in question. Such mean values were used in 
Sec. 13 in deriving Newton's second law of motion for wave packets. 
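A mean value of this kind is easily evaluated numerically once the probability density is given. The sketch below assumes, purely for illustration, a one-dimensional Gaussian density of center x0 and width s. It estimates the mean of any function q(x) by the trapezoidal rule; the function names are hypothetical.

```python
import math


def gaussian_density(x, x0, s):
    """Normalized Gaussian probability density of center x0 and standard deviation s."""
    return math.exp(-(x - x0) ** 2 / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))


def mean_value(q, density, a, b, steps=20000):
    """Trapezoidal estimate of the integral of q(x) * density(x) over [a, b]."""
    dx = (b - a) / steps
    total = 0.5 * (q(a) * density(a) + q(b) * density(b))
    for i in range(1, steps):
        x = a + i * dx
        total += q(x) * density(x)
    return total * dx
```

For example, with x0 = 1.5 and s = 0.5 the mean of x comes out as 1.5 and the mean of x² as x0² + s² = 2.5, to within the accuracy of the quadrature.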

In order to compute the mean values of other physical quantities, 
involving the measurement of velocities as well as position, a more 
difficult computation is generally necessary, but with the aid of suitably 
defined operators it is possible to set up formulas for mean values which 
are very similar to (33·1). The way in which this is done will be illustrated
in Sec. 33b.

33b. The Linear Momentum Operator. In Chap. II, Sec. 15, we
saw that, by Fourier analysis of $\Psi(x,y,z,t)$, we can derive a probability
amplitude $\Phi(p_x,p_y,p_z,t)$ for the linear momentum of a particle, or
system of particles, such that $\Phi\Phi^{*}\,dp_x\,dp_y\,dp_z$ gives the probability that
the momentum vector terminates in the volume element $dp_x\,dp_y\,dp_z$
of momentum space. With the aid of $\Phi$ we can compute the mean value
of any component of the linear momentum, say $p_x$, by

$$\bar{p}_x = \iiint \Phi^{*}\,p_x\,\Phi\,dp_x\,dp_y\,dp_z = (p_x\Phi,\Phi).$$ (33·2)

An alternative mode of averaging, which does not involve the evaluation
of the $\Phi$ function, can be derived by means of the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$.
When applied to a wave function corresponding to a unique value of
$p_x$, and hence of the form $\Psi = e^{2\pi i p_x x/h}f(y,z,t)$, this operator yields the
relation

$$\frac{h}{2\pi i}\frac{\partial\Psi}{\partial x} = p_x\Psi.$$ (33·3)

Hence, differentiating the general formula of Eq. (15·9) with respect to x,
we obtain

$$\frac{h}{2\pi i}\frac{\partial\Psi}{\partial x} = h^{-3/2}\iiint p_x\,\Phi\,e^{\frac{2\pi i}{h}(p_x x + p_y y + p_z z)}\,dp_x\,dp_y\,dp_z.$$ (33·4)

¹ As emphasized by von Neumann [Göttinger Nachrichten, Math.-phys. Klasse, 248
(1928)], a knowledge of the mean values of all functions of the coordinates is equivalent
to a knowledge of their distribution function. In other words, the validity of
(33·1) for every q is a necessary and sufficient condition for the validity of the statement
that $|\Psi|^2$ is a probability density in configuration space. Hence (33·1) is sometimes
taken as the basic postulate rather than our assumption regarding $|\Psi|^2$.




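It will be observed that the right-hand member of the above equation
gives the Fourier analysis of the function $\frac{h}{2\pi i}\frac{\partial\Psi}{\partial x}$ into plane harmonic
waves. In other words, $p_x\Phi$ is the Fourier transform of $\frac{h}{2\pi i}\frac{\partial\Psi}{\partial x}$ just as
$\Phi$ is the Fourier transform of $\Psi$. But by a known theorem of Fourier
analysis¹ the scalar product of two quadratically integrable functions is
equal to the scalar product of their Fourier transforms.
Hence

$$(p_x\Phi,\Phi) = \left(\frac{h}{2\pi i}\frac{\partial\Psi}{\partial x},\Psi\right),$$

or

$$\bar{p}_x = \iiint \Psi^{*}\,\frac{h}{2\pi i}\frac{\partial\Psi}{\partial x}\,dx\,dy\,dz.$$ (33·5)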

Thus the mean value of each of the components of linear momentum
can be evaluated by a rule formally the same as that used for f(q), but
with the substitution of the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$ for the momentum component
to be averaged. We might say that the average value of the momentum
$p_x$ in momentum space is equal to the "average value" of the corresponding
operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$ in x, y, z space.
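The operator rule for the mean momentum can be tried out on a wave packet whose mean momentum is known in advance. The sketch below is a one-dimensional illustration under stated assumptions: Planck's constant is set to 1 in arbitrary units, the packet is a Gaussian envelope multiplied by the plane-wave factor exp(2πi p0 x/h), and the derivative is replaced by a central difference. All names are hypothetical.

```python
import cmath
import math

H = 1.0  # Planck's constant in arbitrary units (an assumption of this sketch)


def packet(x, p0=0.7, s=1.0):
    """Normalized Gaussian packet of width s carrying mean momentum p0."""
    norm = (2.0 * math.pi * s * s) ** -0.25
    return norm * math.exp(-x * x / (4.0 * s * s)) * cmath.exp(2j * math.pi * p0 * x / H)


def mean_momentum(psi, a, b, steps=20000):
    """Integral of conj(psi) * (h / 2 pi i) * dpsi/dx over [a, b], by a Riemann sum."""
    dx = (b - a) / steps
    total = 0.0 + 0.0j
    for i in range(1, steps):
        x = a + i * dx
        dpsi = (psi(x + dx) - psi(x - dx)) / (2.0 * dx)  # central difference
        total += psi(x).conjugate() * (H / (2j * math.pi)) * dpsi
    return total * dx
```

The result is real to within rounding and reproduces p0 to within the small error of the difference approximation, without any reference to the momentum amplitude.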

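In the same way we can formally determine the mean value of any
positive power of $p_x$ by the formula

$$\overline{p_x^{\,n}} = \iiint \Psi^{*}\left(\frac{h}{2\pi i}\right)^{n}\frac{\partial^{n}\Psi}{\partial x^{n}}\,dx\,dy\,dz.$$ (33·6)

From Eqs. (33·6) and (5·5) it follows that the mean value of the square of
the total linear momentum for a single-energy wave function in three
dimensions is equal to the mean value of the square of the classical local
momentum. Thus

$$\overline{p^2} = \overline{p_x^2} + \overline{p_y^2} + \overline{p_z^2} = 2\mu\,\overline{(E - V)}.$$ (33·7)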

Using this definition of the statistical mean value of a component of
linear momentum we observe that Eqs. (13·8) and (13·9) are equivalent to

$$\mu\frac{d\bar{x}}{dt} = \bar{p}_x; \qquad \frac{d\bar{p}_x}{dt} = -\overline{\frac{\partial V}{\partial x}}.$$ (33·8)

¹ Cf. footnote 1, p. 36; also Eqs. (30·29) and (32·27).





As Sommerfeld,¹ for instance, has pointed out, the use of the operator
$\frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ for the momentum $p_k$ conjugate to the coordinate $q_k$ is intimately
related to a fundamental theorem derived by Schrödinger² and interpreted
initially as a statement of the law of the conservation of electricity.
In its simple form for a single charged particle in three dimensions, using
Cartesian coordinates, this theorem [cf. Eqs. (8·5) and (8·6)] is

div grad - '*'* grad 'I' + (33-9) 


Here $\mathcal{A}$ is the vector potential of an assumed external electromagnetic
field. If we apply this theorem to the wave function for an electron of
charge e, we may interpret $e\Psi\Psi^{*}$ as the statistical mean charge density.
Equation (33·9) then takes the form of the equation of continuity in
hydrodynamics with the vector density of electric current defined by³

grad grad 'ir* - = ^7. (3310) 


Multiplication by $\mu/e$ gives the mass current density, and integration
over all space should give the product of the mass $\mu$ into the
average vector velocity. Therefore, if we define the momentum in an
electromagnetic field by Eq. (16·24), we obtain for its average x component,
for example, the value


- + j®- - ssj.[* te - *17 r- 


But the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$ is clearly Hermitian with respect to configuration
space and the class of all continuous and piece-by-piece differentiable
functions which approach zero at infinity. This includes class D. Hence
Eq. (33·11) is equivalent to (33·5), and the above discussion yields a
new derivation of the latter formula independent of the Fourier integral
theorem.


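Another important application of the operator $\frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ is the general
proof of the Heisenberg inequality $\Delta p\,\Delta q \ge h/4\pi$ for linear momentum
(cf. Chap. II, Sec. 16). To establish this inequality we first transform
the formula for $(\Delta p_k)^2$ given in (16·1) to

$$(\Delta p_k)^2 = \int \Psi^{*}\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_k} - \bar{p}_k\right)^{2}\Psi\,d\tau$$ (33·12)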

¹ Sommerfeld, Atombau und Spektrallinien, Wellenmechanischer Ergänzungsband,
pp. 284-285, Braunschweig, 1929.

² E. Schrödinger, Ann. d. Physik (4) 81, 136 (1926).

³ Cf. Condon and Morse, Q.M., p. 28.





by means of (33·6). Following Weyl¹ we use a slight generalization of the
Schwarz inequality [cf. Eqs. (22·10) and (22·20)] which states that if
$f_1$, $g_1$, $f_2$, $g_2$ are any four quadratically integrable functions, their scalar
products are subject to the inequality

$$[(f_1,f_1) + (f_2,f_2)]\,[(g_1,g_1) + (g_2,g_2)] \ge |(f_1,g_1) + (f_2,g_2)|^2.$$ (33·13)
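The generalized Schwarz inequality is the ordinary one applied to the pairs (f1, f2) and (g1, g2) regarded as single objects, and it can be tested on finite sequences of complex numbers standing in for the functions. The sketch below assumes the scalar product (f, g) = Σ f g*; the names are hypothetical.

```python
def scalar_product(f, g):
    """(f, g) = sum of f_i times the complex conjugate of g_i."""
    return sum(a * b.conjugate() for a, b in zip(f, g))


def schwarz_margin(f1, g1, f2, g2):
    """Left member minus right member of the generalized Schwarz inequality."""
    left = (scalar_product(f1, f1) + scalar_product(f2, f2)).real * (
        scalar_product(g1, g1) + scalar_product(g2, g2)
    ).real
    right = abs(scalar_product(f1, g1) + scalar_product(f2, g2)) ** 2
    return left - right
```

The margin is never negative, and it vanishes when the pair (g1, g2) is a constant multiple of the pair (f1, f2).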

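Making the identifications

$$f_1 = \left(\frac{h}{2\pi i}\frac{\partial}{\partial q_k} - \bar{p}_k\right)\Psi = f_2^{*}, \qquad g_1 = (q_k - \bar{q}_k)\Psi = -g_2^{*},$$

and using the Hermitian character of the differential operator $\frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ and
the multiplication operator $[q_k\times]$ with respect to coordinate space and
class D functions, we readily deduce the relations

$$(f_1,f_1) + (f_2,f_2) = 2\int\left|\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_k} - \bar{p}_k\right)\Psi\right|^2 d\tau = 2(\Delta p_k)^2,$$ (33·14)

$$(g_1,g_1) + (g_2,g_2) = 2(\Delta q_k)^2,$$ (33·15)

when $\Psi$ is of class D. Similarly,

$$(f_1,g_1) + (f_2,g_2) = \int \Psi^{*}\left[(q_k - \bar{q}_k)\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_k} - \bar{p}_k\right) - \left(\frac{h}{2\pi i}\frac{\partial}{\partial q_k} - \bar{p}_k\right)(q_k - \bar{q}_k)\right]\Psi\,d\tau.$$ (33·16)

The integrand of the right-hand member of (33·16) reduces at once to
$-\frac{h}{2\pi i}\Psi\Psi^{*}$ [cf. Eq. (37·5)] and the integral itself reduces to $-\frac{h}{2\pi i}$. Combining
this result with (33·14), (33·15), and (33·13), we obtain

$$(\Delta p_k)^2(\Delta q_k)^2 \ge \frac{h^2}{16\pi^2},$$ (33·17)

which is the square of the desired Heisenberg inequality.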
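The inequality can be checked numerically for particular wave functions. The sketch below works in one dimension with Planck's constant set to 1 (an assumption), forms Δq and Δp from the operator expressions by simple quadrature and a central difference, and is applied to two illustrative packets: a Gaussian, for which the product should equal h/4π, and an odd function x·exp(−x²/4s²), for which it should be three times as large. The packets and names are not taken from the text.

```python
import cmath
import math

H = 1.0  # Planck's constant in arbitrary units (an assumption of this sketch)


def uncertainties(psi, a, b, steps=20000):
    """Return (delta_q, delta_p) for a wave function psi sampled on [a, b]."""
    dx = (b - a) / steps
    norm = q1 = q2 = p1 = p2 = 0.0
    for i in range(1, steps):
        x = a + i * dx
        value = complex(psi(x))
        # (h / 2 pi i) d psi / dx by a central difference
        p_psi = (H / (2j * math.pi)) * (psi(x + dx) - psi(x - dx)) / (2.0 * dx)
        weight = abs(value) ** 2
        norm += weight
        q1 += x * weight
        q2 += x * x * weight
        p1 += (value.conjugate() * p_psi).real
        p2 += abs(p_psi) ** 2
    q_mean, p_mean = q1 / norm, p1 / norm
    delta_q = math.sqrt(q2 / norm - q_mean ** 2)
    delta_p = math.sqrt(p2 / norm - p_mean ** 2)
    return delta_q, delta_p


def gaussian_packet(x, p0=0.3, s=0.8):
    """Unnormalized Gaussian packet with mean momentum p0."""
    return math.exp(-x * x / (4.0 * s * s)) * cmath.exp(2j * math.pi * p0 * x / H)


def odd_packet(x, s=0.8):
    """Unnormalized odd wave function x * exp(-x^2 / 4 s^2)."""
    return x * math.exp(-x * x / (4.0 * s * s))
```

The sums are divided by the norm, so the packets need not be normalized beforehand. The Gaussian attains the lower limit, while the odd packet lies well above it.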


The machinery at hand permits a useful generalization of our previous treatment 
of the variation of Aqk^ in time. For our purposes it will suffice to consider the free 
particle for which Ap** is constant. Weshall carry out the calculation for the one- 
dimensional case, the extension to three dimensions being obvious. 

The momentum probability amplitude is given by 





2iriqp 

* dq. 


Hence 


h 

2«rt dp 




/-■ 


qA^e 


2iriqp 

h dq. 


¹ Hermann Weyl, The Theory of Groups and Quantum Mechanics, either 1st or 2d ed., Appendix I. For other proofs see W. Pauli, Jr., in Geiger and Scheel's Handbuch der Physik, XXIV/1, 2d ed., p. 102, Berlin, 1933; also W. Heisenberg, The Physical Principles of the Quantum Theory, pp. 16-19, Chicago, 1930.





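By Plancherel's theorem (cf. footnote 3, p. 36),

$$\overline{q^2} = \int_{-\infty}^{+\infty} |q\Psi|^2\,dq = \int_{-\infty}^{+\infty}\left|\frac{h}{2\pi i}\frac{\partial\Phi}{\partial p}\right|^2 dp.$$

Now for the free particle

$$\Phi(p,t) = e^{-\frac{\pi i p^2 (t-t_0)}{\mu h}}\,\Phi_0, \qquad \Phi_0 = \Phi(p,t_0);$$

$$-\frac{h}{2\pi i}\frac{\partial\Phi}{\partial p} = e^{-\frac{\pi i p^2 (t-t_0)}{\mu h}}\left[\frac{p\,(t-t_0)}{\mu}\,\Phi_0 - \frac{h}{2\pi i}\frac{\partial\Phi_0}{\partial p}\right];$$

$$(\overline{q^2})_t = (\overline{q^2})_{t_0} + \frac{(t-t_0)^2}{\mu^2}\,\overline{p^2} + \frac{t-t_0}{\mu}\,(\overline{pq+qp})_{t_0}.$$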


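Since $\Delta q^2 = \overline{q^2} - \bar q^2$ and $\Delta p^2 = \overline{p^2} - \bar p^2$, we have

$$(\Delta q^2)_t = (\Delta q^2)_{t_0} + \frac{t-t_0}{\mu}\left(\overline{pq+qp} - 2\bar p\,\bar q\right)_{t_0} + \frac{(t-t_0)^2}{\mu^2}\,\Delta p^2. \tag{33·18}$$

It is readily verified that $(\overline{pq+qp} - 2\bar p\,\bar q)_{t=t_0}$ vanishes for the special cases of Secs. 9 and 16. It must do this in the latter case since the presence of a term linear in the time in that case would lead to a violation of the uncertainty principle. In any case $\Delta q^2$ has a minimum at some time $t'$ so that by choosing $t_0 = t'$ we have simply

$$(\Delta q^2)_t = (\Delta q^2)_{t'} + \frac{\Delta p^2}{\mu^2}\,(t-t')^2.$$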


For further discussion, see W. Pauli, in Geiger and Scheel's Handbuch der Physik, XXIV/1, 2d ed., p. 100, Berlin, 1933. Here the term linear in t is interpreted in terms of the probability current density.
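The quadratic growth of the spread measured from its minimum is easy to tabulate. The following sketch assumes the simple form of the result reached above, in which the mean-square spread equals its minimum value plus $(\Delta p^2/\mu^2)(t-t')^2$; the function name and the electron-packet numbers are illustrative assumptions, not values taken from the text.

```python
import math


def delta_q_squared(t, dq2_min, dp2, mu, t_min=0.0):
    """Mean-square spread of a free packet at time t, measured from its minimum at t_min."""
    return dq2_min + dp2 / mu ** 2 * (t - t_min) ** 2


# Illustrative numbers: an electron packet of minimum width 1e-10 m that
# saturates the Heisenberg bound at t = 0, so that delta_p = h / (4 pi delta_q).
h = 6.62607015e-34
mu = 9.1093837e-31
dq2_min = (1.0e-10) ** 2
dp2 = (h / (4.0 * math.pi * 1.0e-10)) ** 2

# Time after which the mean-square spread has doubled.
t_double = math.sqrt(dq2_min) * mu / math.sqrt(dp2)
spread_at_t_double = delta_q_squared(t_double, dq2_min, dp2, mu)
```

Because the time enters only through $(t-t')^2$, the spread is the same at equal intervals before and after the minimum.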

34. THE ANGULAR-MOMENTUM OPERATORS 

34a. Definition of Operators. — In the classical mechanics the angular momentum of a system of particles is defined by the formula

$$\vec{\mathcal L} = \sum_k \mathbf r_k \times \mathbf p_k, \tag{34·1}$$


where $\mathbf p_k$ is the linear momentum of the kth particle and $\mathbf r_k$ is its distance from the origin. With the aid of the corresponding classical expressions for the components of $\vec{\mathcal L}$, it is easy to invent operators which have the same relation to the components of angular momentum as $\frac{h}{2\pi i}\frac{\partial}{\partial x}$ has to the x component of linear momentum.

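Thus, in the case of a single particle where $\mathcal L_z = x p_y - y p_x$, we replace $p_x$, $p_y$ by the corresponding operators to obtain an operator for $\mathcal L_z$ which we designate for the present as $(\mathcal L_z)_{op}$:

$$(\mathcal L_z)_{op} = \frac{h}{2\pi i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right). \tag{34·2}$$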


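In harmony with Eq. (33·3) let us adopt the convention that a particle whose state is described by $\psi$ shall be said to have a unique value of $\mathcal L_z$, say $\mathcal L_z'$, if the application of the operator $(\mathcal L_z)_{op}$ to $\psi$ is equivalent to multiplication by $\mathcal L_z'$. In other words, $\mathcal L_z$ has the unique value $\mathcal L_z'$ if $\psi$ is a solution of the differential equation

$$\frac{h}{2\pi i}\left(x\frac{\partial\psi}{\partial y} - y\frac{\partial\psi}{\partial x}\right) = \mathcal L_z'\,\psi. \tag{34·3}$$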


Otherwise we assign no definite value of $\mathcal L_z$ to the particle under consideration but assume that it has a certain probability of taking on any one of a number of values, like the variety of values of $p_x$, $p_y$, $p_z$ in the case of a wave packet. As a means for determining the probability amplitude for $\mathcal L_z$ in such cases we adopt the scheme of analyzing the wave function into a linear combination of functions which satisfy Eq. (34·3) together with suitable boundary and continuity conditions for some value of the parameter $\mathcal L_z'$. Thus Eq. (34·3) is made the basis of an eigenvalue-eigenfunction problem like the Schrödinger equation (6·9). The eigenvalues of (34·3) are the possible values of $\mathcal L_z$ and the eigenfunctions describe states having definite values of $\mathcal L_z$.

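34b. Hermitian Character of Angular-momentum Operators. — By direct application of Gauss's transformation the operator $(\mathcal L_z)_{op}$ is readily proved to be Hermitian with respect to the class of all quadratically integrable functions which have quadratically integrable transforms by $(\mathcal L_z)_{op}$. Its Hermitian character can also be deduced from the Hermitian character of $x$, $y$, $\frac{h}{2\pi i}\frac{\partial}{\partial x}$, and $\frac{h}{2\pi i}\frac{\partial}{\partial y}$ by means of the relations

$$\left(x\,\frac{h}{2\pi i}\frac{\partial\psi}{\partial y},\ \phi\right) = \left(\frac{h}{2\pi i}\frac{\partial\psi}{\partial y},\ x\phi\right) = \left(\psi,\ x\,\frac{h}{2\pi i}\frac{\partial\phi}{\partial y}\right),$$

$$\left(y\,\frac{h}{2\pi i}\frac{\partial\psi}{\partial x},\ \phi\right) = \left(\frac{h}{2\pi i}\frac{\partial\psi}{\partial x},\ y\phi\right) = \left(\psi,\ y\,\frac{h}{2\pi i}\frac{\partial\phi}{\partial x}\right).$$

$(\mathcal L_x)_{op}$ and $(\mathcal L_y)_{op}$ can be treated in the same way.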

Since these operators are Hermitian, their quadratically integrable eigenfunctions have real eigenvalues. The eigenfunctions of this type for different eigenvalues are mutually orthogonal. Since the existence of a continuous spectrum of eigenvalues necessarily involves the existence of a continuous array of eigenfunctions depending on the eigenvalue parameter, we can readily prove that quadratically integrable eigenfunctions of an Hermitian operator $O$ cannot be associated with a continuous spectrum. For if such a continuous array of eigenfunctions, say $\psi(\lambda,x)$, did exist, the scalar product $(\psi(\lambda,x),\ \psi(\lambda',x))$ would be a continuous function of $\lambda$ and $\lambda'$ which vanishes when $\lambda \neq \lambda'$ but does not vanish when $\lambda = \lambda'$, an obvious impossibility.¹

34c. The Expansion Theorem. — In order to justify our definition of the angular-momentum operators we must develop an expansion theorem for their eigenfunctions. In this particular case the simplest procedure is to begin by finding the actual form of the eigenfunctions in spherical coordinates. The expansion theorem then turns out to be a simple Fourier expansion.

Direct transformation to the coordinates $r$, $\theta$, $\varphi$ of Sec. 28 carries the operator $x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}$ over into $\frac{\partial}{\partial\varphi}$ and Eq. (34·3) becomes

$$\frac{h}{2\pi i}\frac{\partial\psi}{\partial\varphi} = \mathcal L_z'\,\psi. \tag{34·4}$$

It has the solutions

$$\psi = \chi(r,\theta,t)\,e^{\frac{2\pi i \mathcal L_z'\varphi}{h}}. \tag{34·5}$$

By the reasoning of Sec. 28 $\psi$ must have the period $2\pi$ in the argument $\varphi$ if $\psi$ is to be a continuous single-valued function of $x$, $y$, $z$. Hence the quantity $2\pi\mathcal L_z'/h$ is restricted to integral eigenvalues:

$$\mathcal L_z' = \frac{mh}{2\pi}, \qquad m = 0, \pm 1, \pm 2, \cdots \tag{34·6}$$


The eigenfunctions can all be made quadratically integrable by suitable choice of the arbitrary factor $\chi$. In accordance with the general theorem stated above (Sec. 34b), we have a discrete spectrum only for the "physical quantity" $\mathcal L_z$. From our present point of view nonintegral values of $2\pi\mathcal L_z/h$ have no meaning.

Since an arbitrary single-valued continuous function of the Cartesian coordinates transforms in spherical coordinates into a function with the period $2\pi$ in the azimuthal angle $\varphi$, any class A or class D wave function for a single particle can be analyzed into the complex Fourier's series of eigenfunctions

$$\psi = \sum_{m=-\infty}^{+\infty} \chi_m(r,\theta,t)\,e^{im\varphi}. \tag{34·7}$$

¹ Cf. J. Frenkel, Wellenmechanik, Chap. III, Sec. 3, p. 152.





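The coefficients $\chi_m(r,\theta,t)$ are determined by the usual formula

$$\chi_m = \frac{1}{2\pi}\int_0^{2\pi} \psi\,e^{-im\varphi}\,d\varphi$$

and satisfy the completeness relation

$$\int_0^{2\pi} |\psi|^2\,d\varphi = 2\pi \sum_{m=-\infty}^{+\infty} \chi_m\chi_m^* \tag{34·8}$$

[cf. Eq. (25·11)]. It follows that if $\psi$ is normalized,

$$2\pi \sum_{m=-\infty}^{+\infty} \iint \chi_m\chi_m^*\,r^2 \sin\theta\,dr\,d\theta = \iiint |\psi|^2\,dx\,dy\,dz = 1. \tag{34·9}$$

The contribution of the eigenfunction $\chi_m e^{im\varphi}$ to the integrated intensity of the wave system is $2\pi\iint r^2 \sin\theta\,|\chi_m|^2\,dr\,d\theta$. We infer that the series of functions $\chi_m(r,\theta,t)$ play the role of probability amplitude for the set of independent variables $r$, $\theta$, $\mathcal L_z$ (cf. Sec. 15, p. 63).

By shifting the axes of our spherical coordinates we obtain the eigenfunctions for $\mathcal L_x$ and $\mathcal L_y$. Their eigenvalues are of course the same as those of $\mathcal L_z$.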
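The Fourier analysis in the azimuth can be carried out numerically. The sketch below is an illustration under simplifying assumptions: it fixes $r$, $\theta$, and $t$, so that the wave function is a function of the azimuth alone, evaluates each coefficient as $\frac{1}{2\pi}\int_0^{2\pi}\psi\,e^{-im\varphi}\,d\varphi$ by the rectangle rule, and converts the squared moduli into relative weights for the values $mh/2\pi$. The function name and the sample wave function are hypothetical.

```python
import cmath
import math


def azimuthal_coefficients(psi, m_max, n=720):
    """chi_m = (1 / 2 pi) * integral over one period of psi(phi) exp(-i m phi), by the rectangle rule."""
    dphi = 2.0 * math.pi / n
    coefficients = {}
    for m in range(-m_max, m_max + 1):
        total = sum(psi(k * dphi) * cmath.exp(-1j * m * k * dphi) for k in range(n))
        coefficients[m] = total * dphi / (2.0 * math.pi)
    return coefficients


def sample_psi(phi):
    # Hypothetical azimuthal dependence: cos(phi) + (1/2) exp(2 i phi).
    return math.cos(phi) + 0.5 * cmath.exp(2j * phi)


chi = azimuthal_coefficients(sample_psi, m_max=3)
total_weight = sum(abs(c) ** 2 for c in chi.values())
# Relative probability of finding the z component equal to m h / 2 pi.
probability = {m: abs(c) ** 2 / total_weight for m, c in chi.items()}

# Completeness check: integral of |psi|^2 over phi against 2 pi * sum |chi_m|^2.
n_grid = 720
integral = sum(abs(sample_psi(k * 2.0 * math.pi / n_grid)) ** 2 for k in range(n_grid)) * 2.0 * math.pi / n_grid
```

For this sample function only $m = -1$, $+1$, and $+2$ occur, each with coefficient one-half, so the three values are equally probable.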

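34d. Angular Momentum of a System of Particles. — In the general case of a system of $n = f + 1$ particles we define the operators for the three components of angular momentum with respect to the origin of an absolute coordinate system as

$$\begin{aligned}
(\mathcal L_x)_{op} &= \frac{h}{2\pi i}\sum_{k=1}^{f+1}\left(y_k\frac{\partial}{\partial z_k} - z_k\frac{\partial}{\partial y_k}\right),\\
(\mathcal L_y)_{op} &= \frac{h}{2\pi i}\sum_{k=1}^{f+1}\left(z_k\frac{\partial}{\partial x_k} - x_k\frac{\partial}{\partial z_k}\right),\\
(\mathcal L_z)_{op} &= \frac{h}{2\pi i}\sum_{k=1}^{f+1}\left(x_k\frac{\partial}{\partial y_k} - y_k\frac{\partial}{\partial x_k}\right).
\end{aligned} \tag{34·10}$$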


These operators are also Hermitian with respect to coordinate space and the class of single-valued differentiable functions which are quadratically integrable and have quadratically integrable transforms.

In order to resolve them into constituents corresponding to the internal angular momentum, and to the angular momentum of the center of gravity with respect to the origin, we make the change of variables given in Eqs. (15·13) and (15·14). The expression for $(\mathcal L_z)_{op}$ becomes

f 





(34*11) 





We define the operator for the z component of the internal angular momentum to be $\frac{h}{2\pi i}\sum_{k=1}^{f}\left(\xi_k\frac{\partial}{\partial\eta_k} - \eta_k\frac{\partial}{\partial\xi_k}\right)$. Since the same form of operator is used for a given component of the internal angular momentum of a system of $f + 1$ particles and for the absolute angular momentum of a system of $f$ particles, the eigenvalues for the two cases must be the same and the eigenfunctions must have the same form.

In order to determine the spectrum of $(\mathcal L_z)_{op}$ we revert to the last of the formulas (34·10) and replace the Cartesian coordinates of the various particles by spherical coordinates. $(\mathcal L_z)_{op}$ is reduced to the form

$$(\mathcal L_z)_{op} = \frac{h}{2\pi i}\sum_{k=1}^{f}\frac{\partial}{\partial\varphi_k}. \tag{34·12}$$

Introducing the relative azimuthal coordinates¹

$$\varphi = \varphi_1,\quad \alpha_2 = \varphi_2 - \varphi_1,\quad \alpha_3 = \varphi_3 - \varphi_1,\ \cdots,\ \alpha_f = \varphi_f - \varphi_1,$$

we finally obtain

$$(\mathcal L_z)_{op} = \frac{h}{2\pi i}\frac{\partial}{\partial\varphi}. \tag{34·13}$$

Thus the equation defining the eigenfunctions and eigenvalues of $(\mathcal L_z)_{op}$ is identical in form with that used for a single particle. The eigenvalues are again the integral multiples of $h/2\pi$ and the eigenfunctions are

$$\psi = \chi_M(r_1,\theta_1,r_2,\theta_2,\alpha_2,\cdots,t)\,e^{iM\varphi}, \qquad M = 0, \pm 1, \pm 2, \cdots \tag{34·14}$$

Here we adopt the usual practice of spectroscopists in using a capital $M$ as the quantum number associated with $\mathcal L_z$ when referring to the resultant angular momentum of several particles.

Comparing (34·14) with (28·10) we see that the factorable solutions of the two-particle problem with the center of gravity eliminated are eigenfunctions of $(\mathcal L_z)_{op}$ provided that the factor $\Phi(\varphi)$ has the exponential form $e^{im\varphi}$. The trigonometric forms of Eq. (28·10) are linear combinations of $e^{im\varphi}$ and $e^{-im\varphi}$ and as such have definite values of $\mathcal L_z^2$. In fact they are obtained by setting up the operator $\left(\frac{h}{2\pi i}\frac{\partial}{\partial\varphi}\right)^2$ and requiring that the application of this operator to $\psi$ shall yield a multiple of $\psi$.

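¹ The coordinate system

$$\varphi = \frac{1}{f}\sum_{k=1}^{f}\varphi_k,\quad \alpha_2 = \varphi_2 - \varphi_1,\quad \alpha_3 = \varphi_3 - \varphi_2,\ \cdots,\ \alpha_f = \varphi_f - \varphi_{f-1}$$

will do as well and is more symmetrical.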





The expansion theorem of (34·7) holds equally well for the many-particle problem and for the single-particle problem except that the coefficients in the former case depend on a larger group of independent variables. The contribution of the eigenfunction correlated with $M$ to the integrated intensity of the wave system is

$$2\pi\int\cdots\int \chi_M\chi_M^*\prod_{k=1}^{f} r_k^2 \sin\theta_k\,dr_k\,d\theta_k \prod_{j=2}^{f} d\alpha_j.$$

We interpret this contribution as the probability that an appropriate measurement of $\mathcal L_z$ will yield the value $Mh/2\pi$. This statement prescribes in a general way the experimental procedure which must be used in determining the distribution of $\mathcal L_z$ values for any given assemblage of systems, viz., the experimental arrangement must be such as to resolve the primary assemblage into subassemblages each of which has a wave function of the form $\chi_M e^{iM\varphi}$. In the Stern-Gerlach experiment, which measures both atomic magnetic moments and $\mathcal L_z$ values, this resolution is accomplished by sending a beam of atoms through an inhomogeneous magnetic field directed along the z axis in which the path of the atoms in each subassemblage is different from that of atoms in any other subassemblage.

34e. Mean Values. — To get the mean, or expectation, value of $\mathcal L_z$ for any given wave function we use the same procedure as for one of the components of linear momentum.


2 / • • • * sin dkdrkidekYl_dai 


k 


(34-15) 


With the aid of the above expression we can show that the definition of $\mathcal L_z$ given by Eq. (34·3) is equivalent to the classical definition in the limiting case of a sharply defined wave packet. For such a packet, where the range of values of the Cartesian coordinates and momenta is small compared with the corresponding mean values,¹

¹ To prove Eq. (34·16) we note that, if $\psi$ is appreciably different from zero only in the immediate neighborhood of the point $x, y, z = \bar x, \bar y, \bar z$, we can substitute $\bar x$, $\bar y$ for $x$, $y$ in Eq. (34·15) without appreciable error. Then

“ X(«*/ • • ■ -v,f 





■e.- = 2 / • • • 

k k 

(34-16) 

Since the classical values of coordinates and momenta are the same as the 
mean values for the packet, this shows that our definition is equivalent to 
the classical one in the realm of validity of the older theory. 

34f. The Vector Angular Momentum and Its Square; the Symmetric Top. — The possibility of assigning an exact value to $\mathcal L_z$ and hence also to $\mathcal L_x$ or $\mathcal L_y$ does not carry with it the possibility of assigning an exact value to the vector angular momentum $\vec{\mathcal L}$, for there are no simultaneous nonvanishing solutions of the characteristic equations for the three components. On the other hand, unique values for the square of the total angular momentum are possible if we define this quantity by means of the natural operator

$$(\mathcal L^2)_{op} = [(\mathcal L_x)_{op}]^2 + [(\mathcal L_y)_{op}]^2 + [(\mathcal L_z)_{op}]^2. \tag{34·17}$$

$(\mathcal L^2)_{op}$ is readily proved Hermitian with respect to the class of single-valued differentiable functions which, together with their transforms by $(\mathcal L_x)_{op}$, $(\mathcal L_y)_{op}$, $(\mathcal L_z)_{op}$, $[(\mathcal L_x)_{op}]^2$, $[(\mathcal L_y)_{op}]^2$, $[(\mathcal L_z)_{op}]^2$, are quadratically integrable.

In our study of the eigenvalue-eigenfunction problem for $(\mathcal L^2)_{op}$ we begin with the three-dimensional case where $f$ is unity. Introducing spherical coordinates, applying the operator to $\Psi$, and requiring that the result shall be equal to $(\mathcal L^2)'\Psi$ if $\mathcal L^2$ is unique, we obtain the differential equation

$$-\frac{h^2}{4\pi^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\Psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\Psi}{\partial\varphi^2}\right] = (\mathcal L^2)'\,\Psi. \tag{34·18}$$


This is identical with the differential equation (28·6) whose solutions are tesseral harmonics and whose eigenvalues are given by (28·15). Thus

$$(\mathcal L^2)' = l(l+1)\left(\frac{h}{2\pi}\right)^2, \qquad l = 0, 1, 2, 3, \cdots \tag{34·19}$$

confirming the interpretation of the quantity $l(l+1)(h/2\pi)^2$ which was made in Sec. 28 (p. 151).
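The eigenvalue rule can be checked numerically in the simplest case. The sketch below is an illustration only: it assumes an eigenfunction independent of the azimuth, namely a Legendre polynomial $P_l(\cos\theta)$, applies the angular operator $-\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d}{d\theta}\right)$ by finite differences, and compares the result with $l(l+1)$ times the function, so that the eigenvalue is expressed in units of $(h/2\pi)^2$. The function names and the step size are arbitrary choices.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) from the standard three-term recurrence."""
    previous, current = 1.0, x
    if l == 0:
        return previous
    for n in range(1, l):
        previous, current = current, ((2 * n + 1) * x * current - n * previous) / (n + 1)
    return current


def angular_operator(f, theta, eps=1.0e-4):
    """Finite-difference value of -(1/sin) d/dtheta (sin df/dtheta) for a phi-independent f."""
    def flux(angle):
        return math.sin(angle) * (f(angle + eps / 2) - f(angle - eps / 2)) / eps
    return -(flux(theta + eps / 2) - flux(theta - eps / 2)) / (eps * math.sin(theta))


theta0 = 1.0
residuals = []
for l in range(5):
    f = lambda angle, l=l: legendre(l, math.cos(angle))
    # Difference between the operator applied to P_l and l(l+1) P_l.
    residuals.append(angular_operator(f, theta0) - l * (l + 1) * f(theta0))
```

Each entry of `residuals` should be zero to within the accuracy of the differencing.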






In order to deduce the eigenfunctions and eigenvalues of $\mathcal L^2$ for the many-particle problem we introduce a set of coordinate axes $x'$, $y'$, $z'$ so chosen that the $z'$ axis passes through the first particle and the $x'z'$ plane through the second particle. The relative positions of the various particles are then fixed by the coordinates $z_1', x_2', z_2', x_3', y_3', \cdots, z_f'$.

The remaining coordinates are taken to be three Eulerian angles $\theta$, $\varphi$, $\psi$ which define the orientation of the $x'$, $y'$, $z'$ axes with respect to a set of fixed axes $x$, $y$, $z$. These angles are the angles of three successive rotations required to carry a set of movable coordinates $\xi$, $\eta$, $\zeta$ from an initial orientation in coincidence with the $x$, $y$, $z$ system to a final orientation in coincidence with the $x'$, $y'$, $z'$ system. The first rotation is made through an angle $\varphi$ in the positive sense (by the right-handed screw rule) about the $z$ axis. $\varphi$ is so chosen that the $\eta$ axis is carried into coincidence with the intersection of the $x,y$ and $x',y'$ planes (the line of nodes NN in Fig. 14). There are two possible values of $\varphi$ in the range $0 \le \varphi \le 2\pi$ which satisfy this condition, but we resolve the ambiguity by giving $\varphi$ a value less than $\pi$ if, and only if, the projection of the positive $z'$ axis on the $x,y$ plane makes an acute angle with the positive $y$ axis. Then by applying a second rotation to the $\xi$, $\eta$, $\zeta$ axes through a suitable angle $\theta$ in the positive sense about the second position of the $\eta$ axis (line of nodes NN), we can bring the $\xi,\eta$ plane into coincidence with the $x',y'$ plane and the $\zeta$ axis into coincidence with the $z'$ axis. The above choice of the angle $\varphi$ is such that $\theta$ need never exceed the value $\pi$. A third rotation through a suitable angle $\psi$ ($0 \le \psi \le 2\pi$) about the $z'$ axis brings the $\xi,\eta,\zeta$ system into its final configuration. In terms of these angles the direction cosines relating the primed and unprimed sets of coordinates are as follows:

[Fig. 14. Eulerian angles.]





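$$\begin{array}{c|ccc}
 & x & y & z\\ \hline
x' & \cos\varphi\cos\theta\cos\psi - \sin\varphi\sin\psi & \sin\varphi\cos\theta\cos\psi + \cos\varphi\sin\psi & -\sin\theta\cos\psi\\
y' & -\cos\varphi\cos\theta\sin\psi - \sin\varphi\cos\psi & -\sin\varphi\cos\theta\sin\psi + \cos\varphi\cos\psi & \sin\theta\sin\psi\\
z' & \cos\varphi\sin\theta & \sin\varphi\sin\theta & \cos\theta
\end{array} \tag{34·20}$$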
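The table of direction cosines can be verified by carrying out the three rotations one after another. The sketch below is an illustration, not the author's derivation: it assumes the order of rotations described in the text (through $\varphi$ about the $z$ axis, through $\theta$ about the second position of the $\eta$ axis, through $\psi$ about the final $\zeta$ axis), tracks the movable axes as vectors in the fixed frame, and compares the result with the closed-form table. All function names are hypothetical.

```python
import math


def combine(a, u, b, v):
    """Return the vector a*u + b*v."""
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))


def axes_after_rotations(phi, theta, psi):
    """Rows: components of the final xi, eta, zeta axes along the fixed x, y, z axes."""
    xi, eta, zeta = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    # First rotation: through phi about the z axis.
    xi, eta = (combine(math.cos(phi), xi, math.sin(phi), eta),
               combine(-math.sin(phi), xi, math.cos(phi), eta))
    # Second rotation: through theta about the new eta axis (line of nodes).
    xi, zeta = (combine(math.cos(theta), xi, -math.sin(theta), zeta),
                combine(math.sin(theta), xi, math.cos(theta), zeta))
    # Third rotation: through psi about the new zeta axis (the z' axis).
    xi, eta = (combine(math.cos(psi), xi, math.sin(psi), eta),
               combine(-math.sin(psi), xi, math.cos(psi), eta))
    return [xi, eta, zeta]


def direction_cosines(phi, theta, psi):
    """Closed-form table: rows x', y', z'; columns x, y, z."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [
        (cf * ct * cp - sf * sp, sf * ct * cp + cf * sp, -st * cp),
        (-cf * ct * sp - sf * cp, -sf * ct * sp + cf * cp, st * sp),
        (cf * st, sf * st, ct),
    ]


built = axes_after_rotations(0.3, 1.1, 2.0)
table = direction_cosines(0.3, 1.1, 2.0)
largest_difference = max(abs(b - t) for rb, rt in zip(built, table) for b, t in zip(rb, rt))
```

The rows of the table are also mutually orthogonal unit vectors, as they must be for a rotation.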


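The expressions for the operators $(\mathcal L_x)_{op}$, $(\mathcal L_y)_{op}$, $(\mathcal L_z)_{op}$ now take the forms

$$(\mathcal L_x)_{op} = \frac{h}{2\pi i}\left(-\sin\varphi\frac{\partial}{\partial\theta} - \cos\varphi\cot\theta\frac{\partial}{\partial\varphi} + \frac{\cos\varphi}{\sin\theta}\frac{\partial}{\partial\psi}\right), \tag{34·21}$$

$$(\mathcal L_y)_{op} = \frac{h}{2\pi i}\left(\cos\varphi\frac{\partial}{\partial\theta} - \sin\varphi\cot\theta\frac{\partial}{\partial\varphi} + \frac{\sin\varphi}{\sin\theta}\frac{\partial}{\partial\psi}\right), \tag{34·22}$$

$$(\mathcal L_z)_{op} = \frac{h}{2\pi i}\frac{\partial}{\partial\varphi}. \tag{34·23}$$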

Equations (34·21) to (34·23) can be derived by direct transformation or by the following less tedious considerations [cf. G. Breit, "Separation of Angles in the Two-electron Problem," Phys. Rev. 35, 569-578 (1930)]. Let $F$ denote an arbitrary differentiable function of the generalized coordinates $q_1, \cdots, q_s$. A class of linear differential operators applicable to $F$ is defined by the equation

$$W(v_1, \cdots, v_s)\,F = \sum_{k=1}^{s} v_k\frac{\partial F}{\partial q_k}. \tag{a}$$

The operator $W$ depends on the $s$ independent quantities $v_1, v_2, \cdots, v_s$ which may be arbitrary functions of the coordinates $q_1, \cdots, q_s$. The $v$'s can be thought of as velocity components of a fluid motion in the configuration space $q_1, \cdots, q_s$.

The velocity operators $W$ have the important property that

$$W(v_1, \cdots, v_s) + W(v_1', \cdots, v_s') = W(v_1 + v_1', \cdots, v_s + v_s'). \tag{b}$$

Furthermore, if we wish to express $F$ and $W$ in terms of a second set of generalized coordinates $Q_1, \cdots, Q_s$, we have only to write

$$F(q_1, \cdots, q_s) = G(Q_1, \cdots, Q_s), \qquad WG = \sum_{r=1}^{s} V_r\frac{\partial G}{\partial Q_r}, \qquad V_r = \sum_{k=1}^{s} \frac{\partial Q_r}{\partial q_k}\,v_k. \tag{c}$$


The linear-momentum operators are obviously complex multiples of special velocity operators corresponding to uniform translations along the $x$, $y$, and $z$ axes, respectively.

Similarly, $\frac{2\pi i}{h}(\mathcal L_x)_{op}$, $\frac{2\pi i}{h}(\mathcal L_y)_{op}$, $\frac{2\pi i}{h}(\mathcal L_z)_{op}$ are special cases in which the velocity system is that of a uniform rotation of unit angular velocity about the $x$, $y$, and $z$ axes, respectively. Finally the operators $\partial/\partial\varphi$, $\partial/\partial\psi$, $\partial/\partial\theta$ in the coordinate system $\varphi, \psi, \theta, z_1', x_2', \cdots, z_f'$ are velocity operators for rigid rotations of the system of particles about the $z$ axis, the $z'$ axis, and the line of nodes, respectively.

We can now make use of a fundamental theorem regarding the compounding of angular velocities in order to express $(\mathcal L_x)_{op}$, $(\mathcal L_y)_{op}$, $(\mathcal L_z)_{op}$ as linear combinations of the operators $\partial/\partial\varphi$, $\partial/\partial\psi$, $\partial/\partial\theta$ and vice versa. Let $\vec R_1$ and $\vec R_2$ denote two angular velocities with the resultant $\vec R$. Let $\vec V_1$ and $\vec V_2$ denote the vector linear velocities correlated with $\vec R_1$ and $\vec R_2$. Then, according to this theorem, the linear velocity system $\vec V$ for the equivalent single angular velocity $\vec R$ is

$$\vec V = \vec V_1 + \vec V_2.$$

It follows from Eq. (b) that the velocity operator for the angular velocity $\vec R$ is the sum of the velocity operators for $\vec R_1$ and $\vec R_2$. Thus the velocity operators corresponding



to different rigid rotations of the system of particles can be combined just like the corresponding angular-velocity vectors.

Let the second position of the $\eta$ axis in passing from the $x,y,z$ orientation to the $x',y',z'$ orientation be designated by $\eta_2$. This has the direction of the angular velocity associated with $\partial/\partial\theta$. A unit vector in this direction is equal to the sum of vectors of length $\cos(x,\eta_2)$, $\cos(y,\eta_2)$, $\cos(z,\eta_2)$ in the directions of the $x$, $y$, and $z$ axes, respectively. But the operator corresponding to an angular velocity of magnitude $\cos(x,\eta_2)$ in the direction of the $x$ axis is $\frac{2\pi i}{h}\cos(x,\eta_2)(\mathcal L_x)_{op}$. Hence the operator equation corresponding to the resolution of a vector in the direction of the $\eta_2$ axis into vector components along the coordinate axes is

$$\frac{h}{2\pi i}\frac{\partial}{\partial\theta} = \cos(x,\eta_2)(\mathcal L_x)_{op} + \cos(y,\eta_2)(\mathcal L_y)_{op} + \cos(z,\eta_2)(\mathcal L_z)_{op} = -\sin\varphi\,(\mathcal L_x)_{op} + \cos\varphi\,(\mathcal L_y)_{op}.$$

A similar resolution of the angular-velocity systems corresponding to the operators $\frac{h}{2\pi i}\frac{\partial}{\partial\varphi}$, $\frac{h}{2\pi i}\frac{\partial}{\partial\psi}$ yields

$$\frac{h}{2\pi i}\frac{\partial}{\partial\varphi} = (\mathcal L_z)_{op},$$

$$\frac{h}{2\pi i}\frac{\partial}{\partial\psi} = \cos\varphi\sin\theta\,(\mathcal L_x)_{op} + \sin\varphi\sin\theta\,(\mathcal L_y)_{op} + \cos\theta\,(\mathcal L_z)_{op}.$$

Equations (34·21), (34·22), (34·23) are readily obtained by inverting the above equations.


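Squaring and adding these operators we obtain the following characteristic equation for $\mathcal L^2$, to be solved subject to the continuity condition and the requirement of quadratic integrability:

$$(\mathcal L^2)_{op}\,U = -\frac{h^2}{4\pi^2}\left[\frac{\partial^2 U}{\partial\theta^2} + \cot\theta\,\frac{\partial U}{\partial\theta} + \frac{1}{\sin^2\theta}\left(\frac{\partial^2 U}{\partial\varphi^2} + \frac{\partial^2 U}{\partial\psi^2}\right) - \frac{2\cos\theta}{\sin^2\theta}\frac{\partial^2 U}{\partial\varphi\,\partial\psi}\right] = (\mathcal L^2)'\,U. \tag{34·24}$$

Here we indicate the wave function by the symbol $U$ to avoid confusion with the Eulerian angle $\psi$. As the operator does not contain the relative coordinates $z_1', x_2', \cdots, z_f'$ the eigenfunctions of $(\mathcal L^2)_{op}$ are determined by Eq. (34·24) only to a factor which is an arbitrary function of the primed coordinates. To solve Eq. (34·24) we set

$$U = f(z_1', x_2', \cdots, z_f')\,e^{i(M\varphi + N\psi)}\,P(\theta), \qquad M, N = 0, \pm 1, \pm 2, \cdots \tag{34·25}$$

Substitution yields

$$\frac{d^2 P}{d\theta^2} + \frac{\cos\theta}{\sin\theta}\frac{dP}{d\theta} + \left[\frac{4\pi^2(\mathcal L^2)'}{h^2} - \frac{M^2 + N^2 - 2MN\cos\theta}{\sin^2\theta}\right]P = 0. \tag{34·26}$$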

This equation is a generalization of Eq. (28·8), to which it reduces if we set $N$ equal to zero. It can be solved by the polynomial methods used in treating (28·8) and has been so treated by Sommerfeld.¹ Equation (34·24) is in fact essentially the wave equation for the free symmetric
¹ Sommerfeld, Atombau und Spektrallinien, Wellenmechanischer Ergänzungsband, Kap. I, §110, Braunschweig, 1929.





top.¹ The eigenvalues of $\mathcal L^2$ are the same as in the one-electron problem, viz., $L(L+1)(h/2\pi)^2$, where $L = 0, 1, 2, \cdots$. The minimum value of the azimuthal or angular-momentum quantum number $L$ for any given values of $M$ and $N$ is the larger of the two numbers $|M|$ and $|N|$. It follows that for a given value of $L$, $M$ can take on all values between $-L$ and $+L$. The eigenvalue $(\mathcal L^2)' = L(L+1)(h/2\pi)^2$ has $(2L+1)$-fold degeneracy.

Changing the independent variable from $\theta$ to the quantity $t$ defined by

$$t = \tfrac{1}{2}(1 - \cos\theta)$$

permits us to express the eigenfunctions in terms of Jacobi polynomials. Introducing the conventional notation for these polynomials and the definitions

$$d = |M - N|, \qquad s = |M + N|, \qquad 2p = 2L - (d + s),$$

we have

$$P_{LMN}(\theta) = t^{d/2}(1-t)^{s/2}\,F(-p,\ 1 + p + d + s,\ 1 + d,\ t). \tag{34·27}$$

A summary of the more important properties of the Jacobi polynomials is given in Appendix J.

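Since the eigenvalues of $\mathcal L^2$ range from zero to infinity, it is not difficult to show that the series of eigenfunctions $P_{LMN}(\theta)$ for all possible values of $L$ consistent with any given values of $M$ and $N$ forms a complete system. As the exponential functions $e^{iM\varphi}$, $e^{iN\psi}$ also form complete systems, it is possible to expand an arbitrary continuous function of $\varphi$, $\theta$, $\psi$ spread out over the region $0 \le \varphi \le 2\pi$, $0 \le \theta \le \pi$, $0 \le \psi \le 2\pi$ into a mean-square convergent series of the form

$$S(\varphi,\theta,\psi) = \sum_{M=-\infty}^{+\infty}\ \sum_{N=-\infty}^{+\infty}\ \sum_{L \ge |M|,\,|N|} a_{LMN}\,P_{LMN}(\theta)\,e^{i(M\varphi + N\psi)}. \tag{34·28}$$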


35. THE ENERGY OPERATORS

35a. Calculation of Probabilities and Mean Values of Energy. — The expansion (30·32) resolves an arbitrary normal wave function $\Psi$ into a series-integral combination of eigenfunctions of the energy operator $H$. Each term of this expansion is a harmonic function of the time $t$ with the frequency $\nu_n = E_n/h$. In optics the square of the wave amplitude, which we may call the intensity of the wave system, measures the energy density. The integrated intensity of each monochromatic component gives the total energy of the corresponding frequency. If we divide this integrated intensity by the corresponding photon energy
¹ Cf., e.g., Condon and Morse, Q.M., pp. 74-77.





$h\nu$, we obtain the total number of photons of the given frequency in the assemblage described by the wave function. In our present theory the intensity of the waves measures probability density directly, and by analogy we should expect that if we resolve $\Psi$ into monochromatic components and compute the integrated intensity of any of them, the result will measure the total probability of the corresponding frequency and energy. If we have an assemblage of systems in a definite state described by a normalized wave function $\Psi$ and multiply the integrated intensity of the waves of a given frequency by the total number of systems in the assemblage, we obtain the most probable number of systems having the corresponding energy. If the number so calculated is very large, we can identify the most probable number with the actual number found by measurement in any special case without serious likelihood of appreciable error. This method of computing the probability of an energy value is generalized in Sec. 36 to apply to the probabilities of other "dynamical variables" and is subject to further scrutiny in Sec. 41.

In order to reduce the above rule to a formula, we assume that the normalized wave function $\Psi$ is subjected to an expansion of the form of (32·28). By performing the expansion appropriate to the time $t = 0$ and multiplying each term by the appropriate time factor $e^{-\frac{2\pi i E t}{h}}$, we obtain an expansion suitable for an arbitrary time $t$ [cf. Eqs. (30·36) and (30·42)]. Thus

$$\Psi(x,t) = \sum_k \sum_n c_{kn}\,\psi_{kn}(x)\,e^{-\frac{2\pi i E_k t}{h}} + \sum_n \int c_n(E)\,\psi_n(E,x)\,e^{-\frac{2\pi i E t}{h}}\,dE. \tag{35·1}$$

The partial wave function for the discrete energy $E_k$ is $\sum_n c_{kn}\psi_{kn}\,e^{-\frac{2\pi i E_k t}{h}}$. By (32·27) its integrated intensity is $\sum_n c_{kn}c_{kn}^*$. If $\Psi_k$ denotes the normalized value of the above partial wave function, while $N_k$ and $N$ give, respectively, the number of systems of energy $E_k$ and the total number, we have

$$\frac{N_k}{N} = \sum_n c_{kn}c_{kn}^* = |(\Psi, \Psi_k)|^2. \tag{35·2}$$

Similarly the fraction of systems in the infinitesimal interval $dE$ of the continuous range of energy values is $\sum_n c_n(E)\,c_n(E)^*\,dE$, assuming appropriate normalization of the corresponding wave functions. Thus the




set of discrete coefficients $c_{kn}$ together with the functions $c_n(E)$ play the part of probability amplitude for energy.¹
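The rule for the discrete spectrum amounts to summing the squared moduli of the expansion coefficients belonging to each energy level. The sketch below is an illustration restricted to a purely discrete spectrum: the continuous part of the expansion is omitted, and the function names, the dictionary layout, and the sample coefficients are assumptions made for the example.

```python
def energy_distribution(coefficients):
    """Map each energy E_k to the fraction of systems found with that energy.

    `coefficients` maps E_k to the list of expansion coefficients c_kn of the
    degenerate eigenfunctions belonging to that level (hypothetical layout).
    """
    weights = {energy: sum(abs(c) ** 2 for c in cs) for energy, cs in coefficients.items()}
    total = sum(weights.values())
    return {energy: w / total for energy, w in weights.items()}


def mean_energy(coefficients):
    """Mean energy: sum over levels of E_k times the fraction of systems at E_k."""
    return sum(energy * p for energy, p in energy_distribution(coefficients).items())


# Hypothetical expansion: a nondegenerate level E = 1 and a doubly degenerate level E = 4.
sample = {1.0: [0.6], 4.0: [0.48 + 0j, 0.64j]}
fractions = energy_distribution(sample)
average = mean_energy(sample)
```

Dividing by the total weight makes the function usable with unnormalized coefficients as well; for a normalized wave function the total is already unity.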

The above procedure for calculating the probability of different energies is obviously similar in form to the scheme used for computing the probabilities of different linear momenta. In fact it can be derived from the latter in the special case of free particles, where the energy is a function of the momentum only, independent of position. In this case the analysis of $\Psi$ into plane waves is also an analysis into components of definite frequency and energy. This is evident from (15·8). The number of particles of energy intermediate between $E'$ and $E' + dE'$ is equal to the total number of particles with momenta $p_x$, $p_y$, $p_z$ which give points in the spherical shell

$$E' \le \frac{p_x^2 + p_y^2 + p_z^2}{2\mu} \le E' + dE'$$

in momentum space. By the analysis of Sec. 15, this number is equal to $\int_{\text{shell}} |\Phi|^2\,dp_x\,dp_y\,dp_z$, which is nothing more or less than the integrated intensity of the residual wave function in $x,y,z$ space when all harmonic components have been removed except those which satisfy the inequality $E' \le E \le E' + dE'$.

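It follows from (35·1) and Eqs. (32·29) and (32·30) that

$$H\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t} = \sum_k \sum_n c_{kn}E_k\,\psi_{kn}\,e^{-\frac{2\pi i E_k t}{h}} + \sum_n \int c_n(E)\,E\,\psi_n(E,x)\,e^{-\frac{2\pi i E t}{h}}\,dE. \tag{35·3}$$

Hence

$$\bar E = \sum_k \sum_n E_k\,c_{kn}c_{kn}^* + \sum_n \int E\,c_n(E)\,c_n(E)^*\,dE = \int \Psi^* H\Psi\,d\tau = -\frac{h}{2\pi i}\int \Psi^*\frac{\partial\Psi}{\partial t}\,d\tau. \tag{35·4}$$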

¹ It is easy to jump at the conclusion that $\sum_n |c_{kn}\psi_{kn}|^2\,d\tau$ is the probability that an arbitrary system will simultaneously have the energy $E_k$ and a configuration belonging to the volume element $d\tau$, a mistake made by Kemble and Hill, Rev. Mod. Phys. 2, 6 (1930). It is true that it does give the fraction of the original number of systems remaining, if one first eliminates all energy values but $E_k$, and later eliminates all coordinate values outside $d\tau$, but the observation of the energy alters the position and vice versa, so that one cannot use the energy together with the space coordinates to form a single coordinate system.

As an example of a proper coordinate system involving the energy we may mention the three quantities $E$, $\mathcal L^2$, $\mathcal L_z$ in the case of the two-particle problem. The expansion (30·35) resolves the wave function in this case into a linear combination of functions of all three quantities. The system of coefficients forms the probability amplitude for the coordinate system $E$, $\mathcal L^2$, $\mathcal L_z$.





Thus the equivalent operators $H$ and $-\frac{h}{2\pi i}\frac{\partial}{\partial t}$ bear the same relation to the numerical values of $E$ as $\frac{h}{2\pi i}\frac{\partial}{\partial x}$ bears to the numerical values of $p_x$. (As stated on p. 26, classically $-E$ can be regarded as the momentum conjugate to the time $t$ when the latter is treated as a coordinate.) In the case of a sharply defined wave packet the argument of footnote 1, p. 229, can be used to prove that the mean energy derived from (35·4) is equal to the classical energy $H(p,q)$.

*35b. Transformation of Hamiltonian Operator. — The disclosure of the intimate relation between the Hamiltonian function of classical theory in Cartesian coordinates and the fundamental operator of the Schrödinger equation raises the question of the relation between these expressions in other coordinate systems.¹ It is simpler to transform the classical Hamiltonian from one coordinate system to another than to make the corresponding direct transformation of the Hamiltonian operator. Hence a rigorous method for passing from the classical function $H(p,q)$ in generalized coordinates to the appropriate operator $H\!\left(\frac{h}{2\pi i}\frac{\partial}{\partial q},\ q\right)$ would be useful.

A natural conjecture regarding the answer to this question would be that we simply employ the substitution $p_k \rightarrow \frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ again. This rule is sufficient to guarantee that $H$ shall give the classical energy in the case of a well-defined wave packet, but does not always give the same result as direct transformation of the primary operator from the Cartesian system to the system of coordinates desired. Consider, for example, the kinetic term $p_r^2/2\mu$ of the classical Hamiltonian function for a single particle in spherical coordinates. So long as $p_r$ is a number, the factor $p_r^2$ can be replaced by either $\frac{1}{f(r)}\,p_r\,f(r)\,p_r$ or $\frac{1}{f(r)}\,p_r^2\,f(r)$ with $f(r)$ arbitrary, but when we convert $p_r$ into an operator these various forms are no longer equivalent. Direct transformation of the Cartesian operator yields the radial term [cf. Eq. (28·3)]

$$-\frac{h^2}{8\pi^2\mu}\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right),$$

which is obtainable by our simple substitution rule only if we start from the special classical form $r^{-2}p_r\,r^2 p_r$. We are thus driven back on a direct transformation of the Hamiltonian operator, or, what comes to the same thing, of the Laplacian operator, from Cartesian to generalized coordinates.
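The dependence of the result on the ordering of the factors can be made explicit by a short worked step. It is added here as an illustration; it assumes only the substitution of $\frac{h}{2\pi i}\frac{\partial}{\partial r}$ for $p_r$ and compares the plain square with the special ordering $r^{-2}p_r\,r^2 p_r$ acting on an arbitrary differentiable function $\psi(r)$.

```latex
\begin{aligned}
p_r &\rightarrow \frac{h}{2\pi i}\frac{\partial}{\partial r},\\[4pt]
p_r^2\,\psi &\rightarrow -\frac{h^2}{4\pi^2}\,\frac{\partial^2\psi}{\partial r^2},\\[4pt]
r^{-2}\,p_r\,r^2\,p_r\,\psi &\rightarrow -\frac{h^2}{4\pi^2}\,\frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial\psi}{\partial r}\right)
  = -\frac{h^2}{4\pi^2}\left(\frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r}\right).
\end{aligned}
```

The two orderings differ by the term $-\frac{h^2}{4\pi^2}\frac{2}{r}\frac{\partial\psi}{\partial r}$, which is absent classically because the factors then commute. Only the second ordering, divided by $2\mu$, agrees with the radial part of the transformed Laplacian.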

¹ Cf. E. Schrödinger, Ann. d. Physik (4) 79, 747-748 (1926); Courant-Hilbert, M.M.P., §8, pp. 192-195; B. Podolsky, Phys. Rev. 32, 812 (1928).





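This transformation is sometimes facilitated by the use of formulas derivable from the variation principle (or otherwise). Let us first introduce a weighted set of Cartesian coordinates $X_1, X_2, \cdots, X_\lambda$ defined by the equations

$$X_1 = \mu_1^{1/2}x_1,\quad X_2 = \mu_1^{1/2}y_1,\quad X_3 = \mu_1^{1/2}z_1,\quad X_4 = \mu_2^{1/2}x_2,\ \cdots.$$

These transform the operator $\sum_k \frac{1}{\mu_k}\nabla_k^2$ into the $\lambda$-dimensional Laplacian $\sum_{r=1}^{\lambda}\frac{\partial^2}{\partial X_r^2}$.

Let the final set of generalized coordinates be designated by the symbols $q_1, \cdots, q_\lambda$. Let the symbols $g_{ik}$ and $g^{ik}$ denote the quantities

$$g_{ik} = \sum_{r=1}^{\lambda}\frac{\partial X_r}{\partial q_i}\frac{\partial X_r}{\partial q_k}, \qquad g^{ik} = \sum_{r=1}^{\lambda}\frac{\partial q_i}{\partial X_r}\frac{\partial q_k}{\partial X_r}, \tag{35·5}$$

respectively. Let $g$ denote the determinant

$$g = \begin{vmatrix} g_{11} & g_{12} & \cdots & g_{1\lambda}\\ g_{21} & g_{22} & \cdots & g_{2\lambda}\\ \vdots & \vdots & & \vdots\\ g_{\lambda 1} & g_{\lambda 2} & \cdots & g_{\lambda\lambda} \end{vmatrix}$$

($g^{1/2}$ is the functional determinant, or Jacobian, of the coordinates $X_1, \cdots, X_\lambda$ with respect to the system $q_1, \cdots, q_\lambda$). Then the $\lambda$-dimensional Laplacian of $\psi$ in the $q$ system is

$$\nabla^2\psi = g^{-1/2}\sum_i \sum_k \frac{\partial}{\partial q_i}\left(g^{1/2}g^{ik}\frac{\partial\psi}{\partial q_k}\right) \tag{35·6}$$

and the Schrödinger equation in the $q$ system is

$$H\psi \equiv -\frac{h^2}{8\pi^2}\,g^{-1/2}\sum_i \sum_k \frac{\partial}{\partial q_i}\left(g^{1/2}g^{ik}\frac{\partial\psi}{\partial q_k}\right) + V\psi = E\psi. \tag{35·7}$$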

It is convenient to designate the wave function of Eq. (35·7) by the symbol $\psi_X$ to indicate the fact that its normalization condition is by hypothesis

$$\int \psi_X\psi_X^*\,dX_1\cdots dX_\lambda = 1. \tag{35·8}$$

$\psi_X$ can be expressed either in terms of the original Cartesian coordinates, or in terms of the $q$'s, but, whereas $|\psi_X|^2\,dX_1\cdots dX_\lambda$ gives the probability of the element $dX_1\,dX_2\cdots dX_\lambda$ of the $X$ coordinate space, we must multiply $|\psi_X|^2$ by the Jacobian $g^{1/2}$ in order to obtain the probability density in $q$ space. Thus the normalization integral for $\psi_X$ in $q$ space has the form $\int \psi_X\psi_X^*\,g^{1/2}\,dq_1\cdots dq_\lambda$. This extra factor $g^{1/2}$ is sometimes called the density factor for $\psi_X$ in $q$ space.

Let us now introduce a new wave function defined by the formula

$$\psi_q = g^{1/4}\psi_X. \tag{35·9}$$

The normalization condition for $\psi_q$ is

$$\int \psi_q\psi_q^*\,dq_1\cdots dq_\lambda = \int \psi_X\psi_X^*\,dX_1\cdots dX_\lambda = 1. \tag{35·10}$$

Thus $\psi_q$ has the density factor unity in $q$ space, whereas $\psi_X$ has the density factor unity in $X$ space. The substitution of $g^{-1/4}\psi_q$ for $\psi_X$ in Eq. (35·7) gives the following differential equation for $\psi_q$:


Hv'f'q = 


i k 

dQidqk dqidqn r dq,Y ^dq,ri 


+ (35-11) 


Example. — As an illustration of the application of the above formulas let us consider the transformation of the wave equation for the three-particle problem (the coordinates of the center of gravity eliminated) from the Cartesian coordinates of Eq. (32·17) to the spherical coordinates of Eq. (32·18). We reletter the coordinates as follows:

Xi = fi, X\ = {2, qi = ^4 = ^2, 

X 2 = vi} Xi = 712, q2 = a, Qb = <pi, 

X3 = fi, Xs = f 2 , qs = ^it Qb = ^2. 

The transformation equations are then 

Xi = qi cos q 2 sin 73 cos qs, X4 = qi sin q 2 sin ^4 cos q^, 

X 2 = cos q 2 sin <73 sin q^, X5 == gi sin qz sin ^4 sin qe, 

Xz = q\ cos q 2 cos qz, Xz == qi sin qz cos q^. 


The $q$ system of coordinates is an orthogonal one, i.e., one for which
the determinant $g$ takes a diagonal form owing to the fact that all $g_{ik}$'s
vanish for which $i \neq k$. The general formulas,

$$\sum_j g_{ij}\, g^{jk} = \begin{cases} 0 & \text{for } k \neq i, \\ 1 & \text{for } k = i, \end{cases} \tag{35-12}$$

then show that $g^{ik} = 0$ if $i \neq k$ and that $g^{kk} = 1/g_{kk}$. From Eqs. (35-5)
we obtain

$$\begin{array}{ll} g_{11} = 1, & g_{44} = q_1^2 \sin^2 q_2, \\ g_{22} = q_1^2, & g_{55} = q_1^2 \cos^2 q_2 \sin^2 q_3, \\ g_{33} = q_1^2 \cos^2 q_2, & g_{66} = q_1^2 \sin^2 q_2 \sin^2 q_4, \end{array}$$

$$g^{1/2} = q_1^5 \sin^2 q_2 \cos^2 q_2 \sin q_3 \sin q_4.$$
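
The diagonal form of the metric can be checked numerically. The following is a minimal sketch, assuming the six transformation equations of this example and the first of Eqs. (35-5); the function names, the test point, and the use of central differences are choices made for this illustration only.

```python
import math


def cartesian(q):
    """Map (q1, ..., q6) to (X1, ..., X6) with the transformation equations of the example."""
    q1, q2, q3, q4, q5, q6 = q
    return [
        q1 * math.cos(q2) * math.sin(q3) * math.cos(q5),
        q1 * math.cos(q2) * math.sin(q3) * math.sin(q5),
        q1 * math.cos(q2) * math.cos(q3),
        q1 * math.sin(q2) * math.sin(q4) * math.cos(q6),
        q1 * math.sin(q2) * math.sin(q4) * math.sin(q6),
        q1 * math.sin(q2) * math.cos(q4),
    ]


def metric(q, step=1e-6):
    """g_ik = sum_r (dX_r/dq_i)(dX_r/dq_k), with derivatives from central differences."""
    n = len(q)
    rows = []  # rows[i][r] approximates dX_r/dq_i
    for i in range(n):
        plus, minus = list(q), list(q)
        plus[i] += step
        minus[i] -= step
        x_plus, x_minus = cartesian(plus), cartesian(minus)
        rows.append([(a - b) / (2 * step) for a, b in zip(x_plus, x_minus)])
    return [[sum(rows[i][r] * rows[k][r] for r in range(n)) for k in range(n)]
            for i in range(n)]


def expected_diagonal(q):
    """Closed-form g_11 ... g_66 for the orthogonal q system."""
    q1, q2, q3, q4, _, _ = q
    return [
        1.0,
        q1 ** 2,
        (q1 * math.cos(q2)) ** 2,
        (q1 * math.sin(q2)) ** 2,
        (q1 * math.cos(q2) * math.sin(q3)) ** 2,
        (q1 * math.sin(q2) * math.sin(q4)) ** 2,
    ]


q_test = (1.3, 0.7, 0.9, 1.1, 0.4, 2.0)  # arbitrary test point
g = metric(q_test)
diag = expected_diagonal(q_test)

max_off_diagonal = max(abs(g[i][k]) for i in range(6) for k in range(6) if i != k)
max_diagonal_error = max(abs(g[i][i] - diag[i]) for i in range(6))

# For a diagonal metric the determinant is the product of the diagonal entries.
sqrt_g_numeric = math.sqrt(math.prod(g[i][i] for i in range(6)))
q1, q2, q3, q4 = q_test[:4]
sqrt_g_formula = q1 ** 5 * math.sin(q2) ** 2 * math.cos(q2) ** 2 * math.sin(q3) * math.sin(q4)
```

Comparing `sqrt_g_numeric` with `sqrt_g_formula` tests the density factor $g^{1/2}$ at the chosen point; other test points can be substituted freely as long as the sines involved stay positive.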

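Equation (35-6) yields

$$\begin{aligned} \nabla^2\psi &= \sum_k g^{-1/2}\,\frac{\partial}{\partial q_k}\left(\frac{g^{1/2}}{g_{kk}}\frac{\partial\psi}{\partial q_k}\right) \\ &= \frac{1}{q_1^5}\frac{\partial}{\partial q_1}\left(q_1^5\frac{\partial\psi}{\partial q_1}\right) + \frac{1}{q_1^2 \sin^2 q_2 \cos^2 q_2}\frac{\partial}{\partial q_2}\left(\sin^2 q_2 \cos^2 q_2\frac{\partial\psi}{\partial q_2}\right) \\ &\quad + \frac{1}{q_1^2 \cos^2 q_2 \sin q_3}\frac{\partial}{\partial q_3}\left(\sin q_3\frac{\partial\psi}{\partial q_3}\right) + \frac{1}{q_1^2 \sin^2 q_2 \sin q_4}\frac{\partial}{\partial q_4}\left(\sin q_4\frac{\partial\psi}{\partial q_4}\right) \\ &\quad + \frac{1}{q_1^2 \cos^2 q_2 \sin^2 q_3}\frac{\partial^2\psi}{\partial q_5^2} + \frac{1}{q_1^2 \sin^2 q_2 \sin^2 q_4}\frac{\partial^2\psi}{\partial q_6^2}. \end{aligned} \tag{35-13}$$

The resulting wave equation, after reverting to the original symbols, is
(32-18), if $D$ is given the explicit form,

$$\begin{aligned} D &= \frac{1}{\sin^2\alpha \cos^2\alpha}\frac{\partial}{\partial\alpha}\left(\sin^2\alpha \cos^2\alpha\frac{\partial}{\partial\alpha}\right) + \frac{1}{\cos^2\alpha \sin\theta_1}\frac{\partial}{\partial\theta_1}\left(\sin\theta_1\frac{\partial}{\partial\theta_1}\right) \\ &\quad + \frac{1}{\cos^2\alpha \sin^2\theta_1}\frac{\partial^2}{\partial\varphi_1^2} + \frac{1}{\sin^2\alpha \sin\theta_2}\frac{\partial}{\partial\theta_2}\left(\sin\theta_2\frac{\partial}{\partial\theta_2}\right) + \frac{1}{\sin^2\alpha \sin^2\theta_2}\frac{\partial^2}{\partial\varphi_2^2}. \end{aligned} \tag{35-14}$$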


For $\psi_q$, Eq. (35-11) gives




= i ^ [ 1 [ 


+ 

~ sin^ay 


i/1 9 ' 


+ 


/i*L cos”a\30i 


bs-a\c 


a 2 


+ 


1 d^ 


-p 1 H" csc^^i ^ 


a2_ ^ 1 + csc^^a ^ 


sin 2 a 2 dip 2 ^ 


sin'-^ai d(pi^ ' 4 

J re]}'^® ^(9)^9 = 

(35-16) 


It will be noted that all the cross-derivatives are eliminated by the
transition from $\psi_X(q)$ to $\psi_q(q)$. This is true whenever the $q$ system is an
orthogonal one as may be seen from Eq. (35-11).


36. DYNAMICAL VARIABLES IN GENERAL 

36a. Remarks on the Value of the General Theory. — The foregoing 
discussion of the operators for linear momentum, angular momentum, 
and energy cries out for generalization. One would like to know whether
every measurable physical quantity $\alpha$ depending on the state of a dynamical
system is to be associated with a corresponding operator $(\alpha)_{\text{op}}$ and,
if so, what properties are to be assigned to the class of operators
representing measurable physical quantities.

It must be frankly admitted at the outset that an examination of
the efforts which have been made to set up a general theory answering
the above and related questions suggests at times the possibility that the
game is not worth the candle. One of the chief fruits of this quest is the
Dirac-Jordan transformation theory, which gratifies our natural desire
for unity and completeness and has considerable practical value, but
like Dirac's later formalism¹ is of too heuristic a character to be ultimately
satisfying. On the other hand the mathematically rigorous and elegant
method of von Neumann involves a technique at once too delicate and
too cumbersome for the practical purposes of the average physicist.

One's doubts about these developments, with their emphasis on
dynamical variables defined by arbitrary Hermitian operators, are
emphasized by an examination of the dynamical variables actually
measured for atomic systems. As a matter of fact, position, or
configuration, linear momentum, energy, and a single arbitrary component
of magnetic moment are the only independent dynamical variables
whose measurement can be carried out in principle with arbitrary
precision for an individual atomic system. As the magnetic moment is
determined by the angular momentum and energy, one can say that the
only variables which can be defined unambiguously in terms of operations
we can approximate in the laboratory are those already considered,
together with other quantities which are functions of them. Hence
it becomes evident that all of the practical results of nonrelativistic
quantum theory might be derived with the aid of the explicit discussion
of the small group of dynamical variables mentioned and their operators,
without a general theory.

Nevertheless, it is to be remembered that the development of scientific
theory is always conditioned by artistic considerations of simplicity and
symmetry, and by the urge for unity and completeness. Hence a proper
exposition of modern physical theory ought to include more than a bare
analysis of the relation between the essential postulates and the
experiments which have been performed or which may soon be performed.
It must include an examination of efforts to reduce the theory to a
compact, unified, and general scheme, even when these efforts are only
partially successful. In fact such an examination is not only desirable
from a philosophical standpoint but is a practical necessity for the
student to whom much of the literature of the subject would otherwise
remain a closed book. Moreover, the attempts at generalization form
an essential background for any endeavor to develop the theory to include
a wider range of experimental facts. The reader is therefore invited
to join the author in an attempt to generalize and unify the results
already obtained.

¹ I.e., the formalism of his book on quantum mechanics.

36b. Possibility of Defining Physical Quantities by Operators. — In 

view of the above remarks concerning the dynamical variables which are 
actually measurable for atomic systems — or for any system, provided the 
accuracy is sufficient to take the measurement out of the domain of 
classical physics into that of quantum physics — we shall assume that the 
answer to the first of the two questions raised at the beginning of the 
last section is affirmative. To be specific we postulate that every
dynamical variable $\alpha$ defined operationally by an exact scheme of
measurement can be correlated with a corresponding linear operator
$(\alpha)_{\text{op}}$. Postponing to Chap. IX an examination of the relation between
the scheme of measurement and the properties of the operator in question,
we turn our attention next to the question of the common characteristics
of the class of operators $(\alpha)_{\text{op}}$ associated with actually, or potentially,
measurable physical quantities.

These common characteristics can be inferred by induction from the
special cases of positional coordinates, linear momentum, angular
momentum, and energy previously examined. In every case the operator
$(\alpha)_{\text{op}}$ is linear and Hermitian¹ with respect to a class of functions which
includes class $D$ and which in the language of von Neumann (cf. Sec. 32c)
is "everywhere dense" in the Hilbert space of all quadratically integrable
functions. In all cases except those of the positional coordinates there
exists a complete discrete-continuous set of solutions of the equation

$$(\alpha)_{\text{op}}\,\phi(x) = \alpha'\phi(x) \tag{36-1}$$

in terms of which an arbitrary quadratically integrable function of the
coordinates, say $\psi(x)$, is mean-square expansible. The eigenvalues $\alpha'$
associated with the elements of this set have been identified with the
values of the variable $\alpha$ which may result from an experimental
measurement. They are all real, due to the Hermitian character of $(\alpha)_{\text{op}}$.²
The expectation value $\bar{\alpha}$ for measurements of systems belonging to an
assemblage in a state described by the wave function³ $\psi(x)$ is given in all
cases by the formula $\int \psi^*(\alpha)_{\text{op}}\psi\, d\tau = ((\alpha)_{\text{op}}\psi, \psi)$. To determine the
probability of any discrete eigenvalue, say $\alpha_n$, one expands $\psi(x)$ into a
discrete-continuous linear combination of eigenfunctions and evaluates the
norm, $N[c_n\psi_n] = (c_n\psi_n, c_n\psi_n)$, of the term (or sum of terms) in the
expansion which belongs to $\alpha_n$. This norm is the probability in question.
In the case of continuous-spectrum eigenvalues one evaluates the
probability of a small range of eigenvalues $d\alpha$ by a similar process. Thus
$(\alpha)_{\text{op}}$ determines the eigenvalues and unites with the state function $\psi(x)$
to fix their probabilities.

¹ For certain purposes it is convenient to include with the Hermitian dynamical
variable operators whose eigenvalues are real, other non-Hermitian operators with
complex eigenvalues. An appropriate generalization of the definition of a
quantum-mechanical dynamical variable is given in Sec. 36j. For the present we
restrict our attention to the real, or Hermitian, case.

² The case where $\alpha'$ is a discrete eigenvalue is covered by the argument of p. 206.
The case in which $\alpha'$ belongs to a continuous spectrum will be dealt with on p. 276.

³ In this section the dependency of the wave functions on the time is of no
importance. Hence we use the symbol $\psi$ for an arbitrary wave function, or
instantaneous $\Psi$.
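
The eigenvalue-eigenfunction recipe just described can be tried on the smallest possible case. The sketch below is an illustration only: it replaces functions by two-component vectors, takes a hypothetical Hermitian matrix `alpha` with eigenvalues +1 and -1, and uses the convention $(f, g) = \sum_i f_i g_i^*$ for the scalar product. All names and numbers are assumptions made for the example.

```python
import math


def inner(f, g):
    """Scalar product (f, g) = sum_i f_i * conj(g_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(f, g))


def apply(matrix, vector):
    """Matrix times vector for nested-list matrices."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


# Hypothetical Hermitian operator with a purely discrete spectrum.
alpha = [[0, 1], [1, 0]]
s = 1 / math.sqrt(2)
eigenvalues = [1.0, -1.0]
eigenvectors = [[s, s], [s, -s]]  # orthonormal eigenvectors of alpha

psi = [0.6, 0.8]  # normalized state: 0.36 + 0.64 = 1

# Expansion coefficients c_n = (psi, phi_n); probability = norm of c_n * phi_n = |c_n|^2.
coefficients = [inner(psi, phi_n) for phi_n in eigenvectors]
probabilities = [abs(c) ** 2 for c in coefficients]

# Expectation value two ways: weighted eigenvalues and the formula ((alpha) psi, psi).
expectation_from_probabilities = sum(a * p for a, p in zip(eigenvalues, probabilities))
expectation_direct = inner(apply(alpha, psi), psi).real
```

The two expectation values agree, and the probabilities sum to unity because `psi` is normalized and the eigenvectors form a complete orthonormal set.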

In the case of the positional coordinates the allowed values and the
probabilities of different ranges are fixed by our initial hypothesis
regarding the physical interpretation of $|\psi|^2$ and the assumption that $\psi$ is a
continuous function of the Cartesian coordinates. Hence it was not
necessary to use an operator to find the eigenvalues and their probabilities.
However, the formula (33-1), or its instantaneous equivalent

$$\bar{q} = \int \psi^* q\, \psi\, d\tau,$$

suggests the propriety of identifying $(q)_{\text{op}}$ with the multiplication operator
$[q\times]$, and implies that this operator, along with $\psi$, must determine the
probabilities of different ranges of $q$ values. The properties of the class
of operators for the positional coordinates thus defined turn out to be
sufficiently different from those of the other Hermitian operators studied,
so that, strictly speaking, we cannot use the above described
eigenvalue-eigenfunction method to predict the results of positional
measurements made on systems in a state described by the known wave function $\psi$.
Nevertheless, this difficulty can be overcome by a proper reformulation
of the eigenvalue-eigenfunction problem, to which we shall later return.
With the aid of such a reformulation it can be shown that the positional
coordinates are no exception to the general rule that the operator
associated with a dynamical variable determines its spectrum and, when
used in conjunction with the wave function $\psi(x)$, can be employed to
work out the appropriate distribution function giving the probabilities
of different measured values.

It will be observed that, inasmuch as the operator $(\alpha)_{\text{op}}$ in any of these
cases determines the result of measuring $\alpha$ for a large number of systems
in any common definite state, it must implicitly determine the
experimental procedure appropriate to measuring $\alpha$. Hence we may say that
a physical quantity $\alpha$ can be defined by the corresponding operator $(\alpha)_{\text{op}}$.

Now it is customary to define the method of measuring physical
quantities without defining these quantities themselves. In fact we have
no satisfactory reason for ascribing objective existence to physical
quantities as distinguished from the numbers obtained when we make the
measurements which we correlate with them. As indicated in Sec. 19d,
there is no real reason for supposing that a particle belonging to an
assemblage described by a wave function $\psi(x)$ has at every moment a
definite, but unknown, position which may be revealed by a measurement
of the right kind, or a definite momentum which can be revealed
by a different measurement. On the contrary, we get into a maze of
contradictions as soon as we inject into quantum mechanics such concepts
carried over from the language and philosophy of our scientific ancestors.
As no scheme of operations can determine experimentally whether
physical quantities such as position and momentum exist and have
unique values when they are not at the moment under observation, nor
whether the number obtained by a measurement describes some objective
property of the thing measured, a strict adherence to the operational
point of view requires that we eliminate such concepts from our theories.
From the standpoint of classical physics such a rejection would perhaps
be a bit of philosophical purism flying so unnecessarily in the face of
common sense that few would care to adopt it. On the other hand
this rejection is a demonstrable logical necessity for quantum
mechanics.¹

We bring up this philosophical question here because of its relation to
our notation. It would evidently be philosophically more exact if we
spoke of "making measurements" of this, that, or the other type instead
of saying that we measure this, that, or the other "physical quantity."
Rather than make such a radical and awkward change in our phraseology,
however, we can continue to use the old language, reinterpreting the
terms "physical quantity" and "dynamical variable," and allowing
them to stand for the corresponding operator which fixes the nature
of the measurement under consideration. We here adopt this latter
procedure as a matter of convenience, interpreting the phrase
"measurement of the $z$ component of angular momentum," for example, as
equivalent to "making eigenvalue observations for the operator $(\mathcal{L}_z)_{\text{op}}$."
In so doing we refuse to recognize any content to the symbol $\alpha$
which does not flow out of the definition of the operator $(\alpha)_{\text{op}}$ and of the
boundary and continuity conditions which fix its eigenvalues and
eigenfunctions. (The boundary and continuity conditions are themselves
ultimately, as we shall see, implicit in the complete definition of the
operator.) Hence we have no real need for the symbol $\alpha$ as distinguished
from $(\alpha)_{\text{op}}$ and can logically discard the complication of the latter symbol.
Henceforth we shall accordingly drop our original operator notation and
substitute for the operator symbol $(\alpha)_{\text{op}}$ the symbol $\alpha$ itself.

The distinction between operators and their eigenvalues will not be
entirely uniform. In some cases we shall use Greek letters for operators
and Roman for their eigenvalues, but usually the eigenvalues will be
indicated by the same symbol as that used for the operator, but with a
prime or a subscript as a distinguishing mark.

¹ This fact has been clearly brought out in the discussion of a recent paper on
quantum mechanics and physical reality by Einstein, Podolsky, and Rosen [Phys.
Rev. 47, 777 (1935)]. See especially the paper by W. H. Furry, Phys. Rev. 49, 393
(1936).

36c. The Transformation of Probability Amplitudes and Dynamical
Variables. — It is necessary to add that the concrete mathematical form
of the operator corresponding to a given type of physical measurement
must vary with the nature of the scheme of probability amplitudes
for which it is to be used. Thus we have already seen that the operators
$H$, $\mathcal{L}_z$ take on different forms for different coordinate systems. In
fact every transformation of the wave function used to describe an
assemblage of dynamical systems involves a corresponding transformation
of the operator for each dynamical variable.

If we know how to transform probability amplitudes from one coordinate
system to another, it is a simple matter to derive corresponding
transformation formulas for the operators associated with a given
dynamical variable. Let $\psi_x$ denote a wave function or probability
amplitude based on the fundamental Cartesian-coordinate system
$x_1, \ldots, x_{3n}$. Let $\psi_q$ denote a probability amplitude which describes
the same state of the same assemblage of dynamical systems in terms
of a second set of independent variables $q_1, \ldots, q_\lambda$. The transformation
of $\psi_x$ into $\psi_q$ can be symbolized by the equation

$$\psi_q = T^{(q)}_{x}\,\psi_x,$$

where $T^{(q)}_{x}$ denotes a linear operator which transforms a function of the
coordinates $x_1, \ldots, x_{3n}$ into a function of the eigenvalues of the $q$
coordinates, say $q_1', \ldots, q_\lambda'$. It is, of course, unnecessary that the
number of independent variables in the $q$ scheme of coordinates shall
be the same as that in the $x$ scheme. The operators employed hitherto
for dynamical variables transform functions of the $x$ coordinates into
new functions of the same coordinates. An operator of this type can
be written as $\alpha^{(x)}_{x}$, but the superscript can be omitted where, as hitherto,
the context precludes any ambiguity.

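The transformation of $\psi_x$ into $\psi_q$ must be reversible if the two functions
are to contain equally complete descriptions of the assemblage under
consideration. We accordingly define the inverse, or reciprocal,
transformation by the equations

$$(T^{-1})^{(x)}_{q}\,T^{(q)}_{x}\,\psi_x = \psi_x, \qquad T^{(q)}_{x}\,(T^{-1})^{(x)}_{q}\,\psi_q = \psi_q. \tag{36-2}$$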

The use of the term probability amplitude¹ for $\psi_q$ is intended to imply
(cf. pp. 63, 236-236) that by summing $\psi_q\psi_q^*$ over discrete eigenvalues
of the $q$'s and integrating over the continuous ones it is possible
to evaluate the probability of any range of eigenvalues in $q'$ space. It is
convenient to introduce the symbol $\mathrm{S}_{q'}$ to denote this mixed process
of summation and integration. If it is to be extended over a domain $G$
in the $\lambda$-dimensional space in which the $q'$'s are laid out as orthogonal
coordinates, we use the symbol $\underset{G}{\mathrm{S}_{q'}}$. The number obtained by applying
the operator $\underset{G}{\mathrm{S}_{q'}}$ to unity will be referred to as the "volume" of $G$.
In case the process is to be extended over all of $q'$ space, the
subscript $G$ is merely omitted.

If the probability of the domain $G$ is to be $\underset{G}{\mathrm{S}_{q'}}\,\psi_q\psi_q^*$, it is necessary
that $\mathrm{S}_{q'}\,\psi_q\psi_q^*$ shall be unity. The obvious parallelism between this
expression and the scalar product $(\psi_x, \psi_x)$ suggests the desirability of
defining the scalar product of two different probability amplitudes in
$q'$ space, say $\psi_q(q')$ and $\phi_q(q')$, by

$$(\psi_q, \phi_q) = \mathrm{S}_{q'}\,\psi_q(q')\,\phi_q^*(q'). \tag{36-3}$$

This type of scalar product shares the properties listed in Sec. 22 for the
scalar product of two functions of the Cartesian coordinates and reduces
to the standard forms (22-16) and (22-13) in special cases.

Fig. 15. — Illustrating the operation $\mathrm{S}_{q'}$ in two dimensions. In this simple
case the circles represent points on the discrete spectra of $q_1$ and $q_2$, the vertical
(and horizontal) lines are the loci of points on the discrete spectrum of $q_1$ and the
continuous spectrum of $q_2$ (and vice versa), while the crosshatched region belongs to
the continuous spectrum of both variables. The diagram shows the discrete and
continuous spectra of $q_1$ overlapping, an unusual case. The process $\mathrm{S}_{q'}$ reduces
to a simple summation for the region $G_1$, to the sum of three single integrals for $G_2$,
to the sum of single integrals and contributions from discrete points in the case
of $G_3$, and to a double integral for $G_4$.

¹ Cf. Hilbert, von Neumann, and Nordheim, Math. Annalen 98, 1 (1927).

If the operator $T^{(q)}_{x}$ is to convert a properly normalized probability
amplitude in $x$ space into a properly normalized probability amplitude
in $q'$ space, it is necessary that $(\psi_x, \psi_x)_x = (T^{(q)}_{x}\psi_x,\, T^{(q)}_{x}\psi_x)_q$. From the
linearity of the operator $T^{(q)}_{x}$, it follows that if the functions $\psi_x(x)$ and
$\phi_x(x)$ are quadratically integrable in $x$ space their transforms by $T^{(q)}_{x}$
have an absolutely convergent scalar product in $q'$ space, such that

$$(T^{(q)}_{x}\psi_x,\, T^{(q)}_{x}\phi_x)_q = (\psi_x, \phi_x)_x. \tag{36-4}$$


It is customary to define a unitary operator as a linear operator of
the type $\alpha^{(x)}_{x}$ which has an adjoint which is also its reciprocal.¹
Let $U^\dagger$ denote the adjoint of such a unitary operator $U$ with respect to the
linear manifold $C$ and the domain of integration $M$ (cf. Sec. 32d). If $\psi_1(x)$
and $\psi_2(x)$ belong to $C$, it follows from the definition of the adjoint that

$$(U\psi_1, U\psi_2) = (U^\dagger U\psi_1, \psi_2) = (\psi_1, \psi_2),$$

the domain of integration for the scalar products being $M$. Thus $U$,
like $T^{(q)}_{x}$, effects a reversible transformation which preserves scalar
products. In fact, if $U$ is a reversible linear operator which maps the
domain $M$ on itself and preserves scalar products over that domain,
it follows at once that it is unitary in the above sense. Hence it is
convenient to generalize the definition as follows: An operator $T^{(q)}_{x}$ is
said to be unitary if it has an inverse $(T^{-1})^{(x)}_{q}$ and preserves scalar products
in the sense of (36-4). This definition makes all the operators which
transform probability amplitudes from one scheme to another unitary,
and is equivalent to our initial definition when applied to operators of
the type $\alpha^{(x)}_{x}$. The transformation produced by a unitary operator is
called a unitary transformation.

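Let $\alpha^{(x)}_{x}$ and $\alpha^{(q)}_{q}$ denote the forms of the operator $\alpha$ appropriate to the
$x$ and $q$ coordinate systems, respectively. Let $\psi_x(x)$ and $\psi_q(q')$ denote,
respectively, an eigenfunction of $\alpha$ in Cartesian-coordinate space and its
transform by $T^{(q)}_{x}$, so that

$$\alpha^{(x)}_{x}\psi_x = \alpha'\psi_x, \qquad \psi_q = T^{(q)}_{x}\psi_x.$$

It follows that

$$T^{(q)}_{x}\,\alpha^{(x)}_{x}\,(T^{-1})^{(x)}_{q}\,\psi_q = \alpha'\psi_q.$$

Consistency demands that

$$\alpha^{(q)}_{q}\,\psi_q = \alpha'\psi_q.$$

In other words, $\alpha$ transforms according to the rule

$$\alpha^{(q)}_{q} = T^{(q)}_{x}\,\alpha^{(x)}_{x}\,(T^{-1})^{(x)}_{q}. \tag{36-5}$$

Such an operator transformation is said to be canonical.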
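
A finite-dimensional sketch may make the rule (36-5) concrete. It assumes that both probability-amplitude schemes are replaced by two-component vectors, that the transformation is a hypothetical 2 x 2 unitary matrix `T`, and that the scalar product is $(f, g) = \sum_i f_i g_i^*$; none of these choices comes from the text.

```python
import math


def dagger(m):
    """Conjugate transpose (adjoint) of a square nested-list matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square nested-list matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def apply(m, v):
    """Matrix times vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def inner(f, g):
    """Scalar product (f, g) = sum_i f_i * conj(g_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(f, g))


def close(a, b, tol=1e-12):
    return abs(a - b) < tol


s = 1 / math.sqrt(2)
T = [[s, 1j * s], [1j * s, s]]   # hypothetical unitary transformation
T_inv = dagger(T)                # for a unitary matrix the inverse equals the adjoint
alpha_x = [[1, 2], [2, -1]]      # Hermitian operator in the "x" scheme

# Canonical transformation of the operator, in the pattern of rule (36-5).
alpha_q = matmul(matmul(T, alpha_x), T_inv)

# An eigenvector of alpha_x with eigenvalue sqrt(5), and an unrelated second vector.
eigenvalue = math.sqrt(5)
psi_x = [2, math.sqrt(5) - 1]
phi_x = [1 - 2j, 0.5j]
psi_q = apply(T, psi_x)
phi_q = apply(T, phi_x)

products_preserved = close(inner(psi_q, phi_q), inner(psi_x, phi_x))
still_hermitian = all(
    close(alpha_q[i][j], complex(alpha_q[j][i]).conjugate())
    for i in range(2) for j in range(2)
)
same_eigenvalue = all(
    close(lhs, eigenvalue * rhs) for lhs, rhs in zip(apply(alpha_q, psi_q), psi_q)
)
```

In this toy case the transformed operator keeps the eigenvalue, the transformed vector is its eigenvector, and scalar products are unchanged, which is the behavior required of the unitary transformations discussed here.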

The transformation of the Hamiltonian operator $H$ in Cartesian
coordinates to the form $H_q$ given in Eq. (35-11) for any system of
generalized positional coordinates $q_1', q_2', \ldots$ is a special case of (36-5)
in which $T^{(q)}_{x}$ is the operation of multiplying the function to be operated
on by the square root of the Jacobian

$$\frac{\partial(x_1, x_2, \ldots)}{\partial(q_1', q_2', \ldots)}$$

of the transformation (the $g$ of Sec. 35 is the square of this Jacobian) and
subsequently replacing each of the Cartesian coordinates by the equivalent
function of the $q'$'s.

¹ Cf., e.g., von Neumann, M.G.Q., p. W

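An important property of canonical transformations is that they
carry Hermitian operators in one type of coordinate space over into
Hermitian operators in another. Let us assume, for example, that
$\alpha^{(x)}_{x}$ is Hermitian with respect to class $D$. Let $\psi_A$ and $\psi_B$ denote any two
class $D$ functions in $x$ space. Then

$$(\alpha^{(x)}_{x}\psi_A, \psi_B)_x = (\psi_A, \alpha^{(x)}_{x}\psi_B)_x.$$

But

$$(\alpha^{(x)}_{x}\psi_A, \psi_B)_x = \big((T^{-1})^{(x)}_{q}\,\alpha^{(q)}_{q}\,T^{(q)}_{x}\psi_A,\ \psi_B\big)_x = \big(\alpha^{(q)}_{q}\,T^{(q)}_{x}\psi_A,\ T^{(q)}_{x}\psi_B\big)_q.$$

Similarly,

$$(\psi_A, \alpha^{(x)}_{x}\psi_B)_x = \big(T^{(q)}_{x}\psi_A,\ \alpha^{(q)}_{q}\,T^{(q)}_{x}\psi_B\big)_q.$$

Therefore,

$$\big(\alpha^{(q)}_{q}\,T^{(q)}_{x}\psi_A,\ T^{(q)}_{x}\psi_B\big)_q = \big(T^{(q)}_{x}\psi_A,\ \alpha^{(q)}_{q}\,T^{(q)}_{x}\psi_B\big)_q, \tag{36-6}$$

which means that $\alpha^{(q)}_{q}$ is Hermitian with respect to $q'$ space and the linear
manifold of functions obtained by applying $T^{(q)}_{x}$ to class $D$.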

The essential thing to be noted here is that, although a single operator
acting on functions of the Cartesian coordinates generates a dynamical
variable, there is a formal difficulty about identifying the variable with
the operator, owing to the fact that the latter has many equivalents
in other probability-amplitude schemes. This difficulty is met by
agreeing that all operators derived from a single parent by means of (36-5)
shall be regarded as different forms of one and the same operator. To
avoid ambiguity an orthogonal Cartesian-coordinate system will be
assumed throughout the following discussion where there is no definite
statement to the contrary.

36d. Type 1 Operators as Dynamical Variables. — As indicated in
Sec. 36a, it is customary to apply the term dynamical variable¹ in quantum
mechanics not only to linear Hermitian² operators which are associated
with actual or conceptual schemes of observation but also to other
operators having similar mathematical properties, but not associated
with such methods of measurement. In effect we include among the
dynamical variables all operators which permit the determination of an
eigenvalue spectrum and allow us to evaluate a distribution function
which would be appropriate to the prediction of the probability of the
different eigenvalues for a given state function $\psi(x)$ if a method of
measurement were known. In defining the dynamical variables of quantum
mechanics it is convenient to draw an initial distinction between those
operators for which the eigenvalue-eigenfunction problem in
Cartesian-coordinate space can be given the conventional form associated with
(36-1) and those which, like the multiplication operators for the positional
coordinates, require a modified formulation of the
eigenvalue-eigenfunction problem. We designate the former as of "type 1" and the latter
as of "type 2." The operator for a given dynamical variable can shift
from type 1 to type 2, or vice versa, as we change from one scheme of
probability amplitudes to another, but so long as we stick to functions
of the rectangular coordinates, $\psi_x$, it is permissible to refer to the
dynamical variables themselves as of type 1 or type 2, as the case may be. We
concentrate our attention for the present on the first type.

The degeneracy of the eigenvalues and consequent ambiguity of their
eigenfunctions makes it difficult to formulate the properties of the
operators singly. In the cases so far studied, however, the type 1
operators can be united into groups having complete systems of
simultaneous eigenfunctions which are nondegenerate. Each of these
eigenfunctions is associated with a corresponding set of eigenvalues and is
uniquely determined by that set except for the usual constant of
proportionality. By choosing this factor once and for all in some
convenient way consistent with the usual normalization rules, we obtain a
perfectly definite system of simultaneous eigenfunctions of the operators
of the group and a corresponding uniquely defined expansion theorem.
In the case of a single particle in three dimensions, for example, the
operators $p_x$, $p_y$, $p_z$ form such a group. The arbitrary set of eigenvalues
$p_x'$, $p_y'$, $p_z'$ has no eigenfunctions except multiples of
$e^{\frac{2\pi i}{h}(p_x' x + p_y' y + p_z' z)}$.
The operators $H$, $\mathcal{L}^2$, $\mathcal{L}_z$ form a similar group for the two-particle problem,
each set of eigenvalues having an essentially unique eigenfunction.

¹ The term observable was introduced by Dirac in the first edition of his book
(P.Q.M., p. 25) with a meaning analogous to our term dynamical variable, but referring
to a particular instant of time. In the second edition the terms observable and
dynamical variable are both used in the same sense as the latter term is used by us.
We also make use of the term observable, but apply it to the restricted class of
dynamical variables which are symmetrical functions of the coordinates and momenta of
identical particles (cf. Sec. 40d, p. 310, and Sec. 42b, page 339).

Of course it is really illogical to apply the term dynamical variable to something
which is not a function of the time for a dynamical system. It is no more illogical,
however, than to apply the term atom, as we do, to systems of particles which are
anything but indivisible. Our excuse is that we have to do with a necessary
generalization of the class of quantum analogues of the classical dynamical variables.

² The Hermitian property is convenient but not essential. The less stringent
requirement that each operator $\alpha$ shall have an adjoint $\alpha^\dagger$ with which it commutes
gives a possible alternative restriction which allows complex eigenvalues.

Not all pairs of operators can be united into such a group, as most
pairs are incompatible in the sense that they have no simultaneous
eigenfunctions. The mathematical conditions of compatibility will
be discussed in Sec. 37. Here it will suffice to note that they are met
in the above cases and to postulate that the operators in which we are
interested can always be united into such groups. In case an operator $\alpha$
has a purely discrete eigenvalue spectrum, no postulate is necessary as
the point is readily proved. Let $\psi_{1k}, \psi_{2k}, \ldots$ denote an orthonormal
set of eigenfunctions of $\alpha$ with the common eigenvalue $\alpha_k$. (If $\alpha$ has a
complete system of eigenfunctions, we can form a complete orthonormal
set $\psi_{nk}$ in an infinite number of ways.) Let us now define the linear
operator $\beta$ by the equations

$$(\beta\psi, \psi_{nk}) = b_n(\psi, \psi_{nk}), \qquad n, k = 1, 2, 3, 4, \ldots$$

or the equivalent

$$\beta\psi = \sum_k \sum_n b_n\,(\psi, \psi_{nk})\,\psi_{nk}.$$

The constants $b_n$ are the eigenvalues of $\beta$, which is Hermitian provided
they are real. The functions $\psi_{nk}$ are simultaneous eigenfunctions which
are nondegenerate since no two of them have the same pair of eigenvalues.
This proves that $\alpha$ can be united with at least one other Hermitian
operator, viz., $\beta$, to form a group with a complete system of
nondegenerate simultaneous eigenfunctions.
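
The construction of $\beta$ can be tried out on a three-dimensional toy case. The sketch below assumes a hypothetical operator $\alpha$ = diag(1, 1, 2), whose eigenvalue 1 is doubly degenerate, an arbitrarily chosen orthonormal basis inside the degenerate subspace, and the illustrative values $b_1 = 10$, $b_2 = 20$; all of these are assumptions made for the example.

```python
import math


def inner(f, g):
    """Scalar product (f, g) = sum_i f_i * conj(g_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(f, g))


# alpha is diagonal here: eigenvalue 1 (twice) and eigenvalue 2 (once).
alpha_diagonal = [1.0, 1.0, 2.0]


def apply_alpha(psi):
    return [a * c for a, c in zip(alpha_diagonal, psi)]


s = 1 / math.sqrt(2)
# basis[(n, k)] is psi_nk: the n-th eigenfunction belonging to the eigenvalue alpha_k.
basis = {
    (1, 1): [s, s, 0.0],
    (2, 1): [s, -s, 0.0],
    (1, 2): [0.0, 0.0, 1.0],
}
alpha_values = {1: 1.0, 2: 2.0}   # alpha_k
b_values = {1: 10.0, 2: 20.0}     # b_n, chosen real and distinct


def apply_beta(psi):
    """beta psi = sum over k and n of b_n (psi, psi_nk) psi_nk."""
    out = [0.0, 0.0, 0.0]
    for (n, _k), vec in basis.items():
        weight = b_values[n] * inner(psi, vec)
        out = [o + weight * v for o, v in zip(out, vec)]
    return out


test_vector = [0.3, -1.2, 0.7]
commutator = [
    x - y
    for x, y in zip(apply_alpha(apply_beta(test_vector)), apply_beta(apply_alpha(test_vector)))
]
eigenvalue_pairs = [(alpha_values[k], b_values[n]) for (n, k) in basis]
```

The commutator vanishes on the test vector, and the three pairs $(\alpha_k, b_n)$ are all different, so each simultaneous eigenfunction is fixed by its pair of eigenvalues.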

In order to put the above statements on an exact basis we need unique 
definitions of the terms eigenfunction and eigenvalue for a general linear 
operator a. This need confronts us once more with the question of 
boundary conditions. 

In Chap. IV the discrete eigenfunctions of Sturm-Liouville equations
with singular boundary points were picked out as solutions of these
equations which satisfy the singular-point boundary conditions and
hence belong to a linear manifold of functions with respect to which the
Sturm-Liouville operator is Hermitian. The discrete eigenfunctions
of the Schrödinger equation $H\psi = E\psi$ were identified with class $D$
solutions of that equation, i.e., with solutions which belong to a narrowly
defined linear manifold of functions with respect to which $H$ is Hermitian.
The continuous-spectrum eigenfunctions in turn were defined as bounded
solutions of $H\psi = E\psi$ which are not quadratically integrable but conform
to the class $D$ conditions at all finite points. In Sec. 32j we found it
necessary to introduce the postulate that the eigendifferentials formed
from the continuous-spectrum eigenfunctions belong to a linear manifold,
including $D$, with respect to which $H$ is Hermitian.




If it were possible, one would prefer to use the class D boundary- 
continuity conditions to define eigenfunctions in all cases, but this 
procedure does not work. The eigendifferentials are not of this class in 
cases where the variables are separable, and in the case of some operators 
ordinarily classed as generators of dynamical variables, and having 
discrete eigenvalues, no class D eigenfunctions exist. Hence we must 
use a wider class than D. We accordingly require that the discrete 
eigenfunctions and the eigendifferentials shall belong to a linear manifold 
of functions with respect to which the operator in question is Hermitian. 

It is convenient at this point to introduce the term Hermitian manifold
of $\alpha$ for a linear manifold of functions with respect to which the operator $\alpha$
is Hermitian when the domain of integration is properly specified. Unless
there is a definite indication to the contrary, we shall hereafter assume 
that the domain of integration is all Cartesian-coordinate space. 

Unfortunately it is possible for an operator to have two or more
Hermitian manifolds which cannot be united without destroying the
Hermitian property. In such cases the operator can have two or more
independent complete systems of eigenfunctions with different spectra.
This possibility has been demonstrated by von Neumann¹ in the case
of the operator $\frac{h}{2\pi i}\frac{d}{dx}$ when the domain of integration is finite. It seems
probable, however, that any two Hermitian manifolds of $\alpha$ which contain
the manifold of physically admissible functions $D$ are capable of union
into a single Hermitian manifold. At any rate we assume the existence
of a unique Hermitian manifold of the operator $\alpha$ which contains class $D$
and every other Hermitian manifold which contains $D$. Let $D_\alpha$ denote
the manifold thus specified. We shall at times refer to it as the type $D$
Hermitian manifold of $\alpha$.

Definition: $\alpha'$ is said to belong to the discrete spectrum of the Hermitian
operator $\alpha$ if there exist one or more nontrivial² solutions of the equation
$\alpha\phi = \alpha'\phi$ which belong to the Hermitian manifold $D_\alpha$.

Definition: $\alpha''$ is said to belong to the type 1 continuous spectrum of the
Hermitian operator $\alpha$ if there exists a continuous family of nontrivial
solutions of $\alpha\phi = \alpha'\phi$ for values of $\alpha'$ in the neighborhood of $\alpha''$ such that
$\int_{\alpha''}^{\alpha''+\eta} \phi(x|\alpha')\, d\alpha'$ belongs to the domain $D_\alpha$ and does not vanish identically
when $\eta$ is made arbitrarily small.³

¹ Cf. M.G.Q., p. 79. In comparing statements in this reference with those made
here the reader is cautioned to note the difference between his method of specifying
an operator and ours (see footnote 1, p. 263, below).

² I.e., solutions which do not vanish identically.

³ The vertical bar in the symbol $\phi(x|\alpha')$ is used here and will be used hereafter to
separate two arguments, or two sets of arguments, of the function in question which
belong to different classes.





Definition: $\phi(x_1,\cdots,x_{3n}|\alpha_1',\cdots,\alpha_\lambda') \equiv \phi(x|\alpha')$ is a simultaneous
eigenfunction of the Hermitian operators $\alpha_1,\cdots,\alpha_\lambda$ with the
eigenvalues $\alpha_1',\cdots,\alpha_\lambda'$ provided that it is a continuous function of all
the continuous-spectrum eigenvalues which satisfies each of the equations
$\alpha_k\phi = \alpha_k'\phi$, and provided that the function derived from φ by integrating
with respect to each of the continuously variable α''s over the range of values
contained in a small cubical element of side η in α' space containing the point
$\alpha_1',\cdots,\alpha_\lambda'$ belongs to the type D Hermitian manifold of each of the α's.
The function $\Delta_\eta^{\alpha'}\phi$ is then said to be an eigendifferential of the set of α's.

Thus, if $\alpha_1'$, $\alpha_2'$ belong to continuous spectra, while the others belong
to discrete spectra, the function

$\Delta_\eta^{\alpha'}\phi = \int_{\alpha_1'}^{\alpha_1'+\eta} d\alpha_1'' \int_{\alpha_2'}^{\alpha_2'+\eta} d\alpha_2''\;\phi(x|\alpha_1'',\alpha_2'',\alpha_3',\cdots,\alpha_\lambda')$    (36·7)

is an eigendifferential provided that it belongs to the type D Hermitian
manifold of each α. A discrete eigenfunction may be regarded as a
special case of an eigendifferential for which the number of integrations is
zero. Formula (36·7) defining $\Delta_\eta^{\alpha'}\phi$ is conveniently generalized to

$\Delta_\eta^{\alpha'}\phi = \sumint_{G(\alpha',\eta)} \phi(x|\alpha''),$    (36·8)

where G(α',η) is a volume element in α'' space having the form of a
"hypercube" with a corner at the point $\alpha_1',\alpha_2',\cdots,\alpha_\lambda'$, and made
small enough to include not more than one discrete eigenvalue of any
variable. By analogy with (32·25) we adopt the normalization rule

$(\Delta_\eta^{\alpha'}\phi,\Delta_\eta^{\alpha'}\phi) = \int\cdots\int |\Delta_\eta^{\alpha'}\phi|^2\,dx_1\cdots dx_{3n} = \eta^m,$    (36·9)

where m denotes the number of continuous-spectrum eigenvalues in the
set $\alpha_1',\cdots,\alpha_\lambda'$, i.e., the number of integrations involved in forming
$\Delta_\eta^{\alpha'}\phi$. The quantity $\eta^m$ which forms the right-hand member of (36·9)
is seen to be the volume of G(α',η) as defined in Sec. 36c. Thus $\eta^{-m}\Delta_\eta^{\alpha'}\phi$ is
the average value of φ(x|α'') for the element G(α',η).
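The normalization rule (36·9) can be checked on a familiar case. The following worked example is an illustration only: it assumes a single continuous eigenvalue k (so that m = 1) and the plane-wave family $\phi(x|k)=e^{ikx}/\sqrt{2\pi}$ in one dimension, neither of which is introduced in this section.

```latex
% Assumed example: \phi(x|k) = e^{ikx}/\sqrt{2\pi}, one continuous eigenvalue (m = 1).
\begin{align*}
\Delta_\eta^{k'}\phi &= \int_{k'}^{k'+\eta}\frac{e^{ikx}}{\sqrt{2\pi}}\,dk
   = \frac{e^{ik'x}\,\bigl(e^{i\eta x}-1\bigr)}{ix\sqrt{2\pi}},\\
\bigl|\Delta_\eta^{k'}\phi\bigr|^2 &= \frac{2\sin^2(\eta x/2)}{\pi x^2},\\
\bigl(\Delta_\eta^{k'}\phi,\Delta_\eta^{k'}\phi\bigr)
   &= \frac{2}{\pi}\int_{-\infty}^{+\infty}\frac{\sin^2(\eta x/2)}{x^2}\,dx
   = \frac{2}{\pi}\cdot\frac{\pi\eta}{2} = \eta .
\end{align*}
```

Although the plane wave itself is not quadratically integrable, its eigendifferential is, and its norm squared equals the volume η of the element of k space, in agreement with (36·9).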

The orthogonality of the eigendifferentials of a set of operators which
belong to different discrete eigenvalues of some one of them, say $\alpha_k$, is
a consequence of the requirement that these eigendifferentials belong to
the Hermitian manifold of $\alpha_k$ (cf. p. 206). In view of the work of Sec. 30
we should expect that two eigendifferentials $\Delta_{\eta'}^{\alpha'}\phi$ and $\Delta_{\eta''}^{\alpha''}\phi$ will also be
orthogonal if the ranges for any one of the continuously variable α's do
not overlap. A rigorous proof of this proposition based on the Hermitian
property is not easily formulated but has been carried through by Carleman.¹
The general orthogonality condition can now be identified with

¹ T. Carleman, Théorie des Équations Intégrales Singulières à Noyau Réel et
Symétrique, Upsala, 1923.





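the statement that $(\Delta_{\eta'}^{\alpha'}\phi,\Delta_{\eta''}^{\alpha''}\phi)$ is zero if the regions G(α',η') and G(α'',η'')
used in defining the eigendifferentials in question do not overlap. This
includes as special cases all three of the orthogonality conditions (32·26).

If u(α'') is a continuous function of the arguments $\alpha_1'',\alpha_2'',\alpha_3'',\cdots,\alpha_\lambda''$,
wherever these arguments are continuously variable,
$\sumint_{G(\alpha',\eta)} u(\alpha'')\phi(x|\alpha'')$
is approximately equal to $u(\alpha')\Delta_\eta^{\alpha'}\phi$ for small values of η. Hence we
may plausibly assume that

$\lim_{\eta\to0}\Bigl[\eta^{-m}\Bigl(\sumint_{G(\alpha',\eta)} u(\alpha'')\phi(x|\alpha'') - u(\alpha')\Delta_\eta^{\alpha'}\phi,\ \sumint_{G(\alpha',\eta)} u(\alpha'')\phi(x|\alpha'') - u(\alpha')\Delta_\eta^{\alpha'}\phi\Bigr)\Bigr] = 0.$    (36·10)

On this assumption it follows that

$\lim_{\eta\to0}\Bigl[\eta^{-m}\Bigl(\sumint_{G(\alpha',\eta)} u(\alpha'')\phi(x|\alpha''),\ \Delta_\eta^{\alpha'}\phi\Bigr)\Bigr] = u(\alpha').$    (36·11)

Furthermore,

$\lim_{\eta\to0}\Bigl[\eta^{-m}\Bigl(\sumint_{G(\alpha',\eta)} u(\alpha'')\phi(x|\alpha''),\ \sumint_{G(\alpha',\eta)} v(\alpha'')\phi(x|\alpha'')\Bigr)\Bigr] = u(\alpha')\,v^*(\alpha').$    (36·12)

Equation (36·12) may be regarded as an extended form of the
normalization-orthogonality condition to be applied to the eigenfunctions of type 1
operators.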

Let the Fourier coefficient c(α') for the arbitrary quadratically
integrable function ψ(x) be defined by

$c(\alpha') = \lim_{\eta\to0}\bigl[\eta^{-m}(\psi,\Delta_\eta^{\alpha'}\phi)\bigr].$    (36·13)

In case the eigenvalues in the set α' are all discrete, the function φ(x|α')
is quadratically integrable, and (36·13) reduces to

$c(\alpha') = \bigl(\psi(x),\phi(x|\alpha')\bigr).$    (36·14)

A similar reduction can be made if ψ(x) is absolutely integrable (a class D
function, for example) while φ(x|α') is bounded. Equations (36·11) and
(36·12) now yield the inequality

$\Bigl(\psi - \sumint_K c(\alpha')\phi,\ \psi - \sumint_K c(\alpha')\phi\Bigr) = (\psi,\psi) - \sumint_K |c(\alpha')|^2 \ge 0,$

where K is any finite region in α' space involving only a finite number of
discrete eigenvalues of any α.

The simultaneous eigenfunctions φ(x|α') are said to form a complete
system if

$(\psi,\psi) = \sumint_{\alpha'} |c(\alpha')|^2$    (36·15)





when ψ(x) is any quadratically integrable function of the Cartesian
coordinates, the operation $\sumint_{\alpha'}$ being extended over all α' space. This
relation is seen to be equivalent to the statement that we can approximate
ψ(x) as closely as desired in the least-squares sense by means of an
expression of the form $\sumint_K c(\alpha')\phi(x|\alpha')$. As a corollary on (36·15) it
follows that if $\psi_A$ and $\psi_B$ are two different quadratically integrable functions
with the Fourier coefficients $c_A$ and $c_B$, respectively,

$(\psi_A,\psi_B) = \sumint_{\alpha'} c_A(\alpha')\,c_B^*(\alpha').$    (36·16)

This result can be derived by the method employed in the proof of (22·32).

The reader will recognize in (36·16) an extrapolation of (36·12).
The validity of (36·16) for any pair of eigendifferentials $\Delta_{\eta'}^{\alpha'}\phi$, $\Delta_{\eta''}^{\alpha''}\phi$ is
evidently a direct consequence of the normalization-orthogonality relation
for eigendifferentials.
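The inequality preceding (36·15) and the completeness relation itself can be watched at work in a purely discrete case. The sketch below is an assumed example, not one used in the text: it takes ψ(x) = x(1 - x) on 0 ≤ x ≤ 1 with the orthonormal system φ_n(x) = √2 sin(nπx), for which the Fourier coefficients and (ψ,ψ) = 1/30 are known in closed form.

```python
import math

# Assumed example: psi(x) = x(1 - x) on [0, 1], phi_n(x) = sqrt(2) sin(n pi x).
NORM_SQUARED = 1.0 / 30.0  # (psi, psi) = integral of x^2 (1 - x)^2 over [0, 1]


def sine_coefficient(n):
    """Closed-form Fourier coefficient c_n = (psi, phi_n)."""
    if n % 2 == 0:
        return 0.0  # even-n terms vanish by symmetry about x = 1/2
    return 4.0 * math.sqrt(2.0) / (n * math.pi) ** 3


def partial_sum(n_max):
    """Sum of |c_n|^2 over the finite region K = {1, ..., n_max}."""
    return sum(sine_coefficient(n) ** 2 for n in range(1, n_max + 1))


# The deficit (psi, psi) - sum_K |c_n|^2 stays non-negative and shrinks as K grows.
for n_max in (1, 3, 9, 99):
    print(n_max, NORM_SQUARED - partial_sum(n_max))
```

The printed deficit is the left-hand member of the inequality for the region K; its approach to zero is the discrete form of (36·15).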

We are now in a position to lay down a satisfactory definition of the 
type 1 operators whose properties were roughly prescribed on p. 249. 

Definition: A type 1 operator $\alpha = \alpha^{(x)}$ is a linear operator which converts
functions of the Cartesian coordinates into functions of the Cartesian
coordinates and has the following properties:

(a) α has a Hermitian manifold $D_\alpha$ which includes the class of physically
admissible functions D and every Hermitian manifold which contains D.

(b) α either has a system of nondegenerate eigenfunctions which are complete
in the sense of Eqs. (36·15) and (36·16), or can be united with one or more
additional operators of the same type to form a set which has a complete
system of nondegenerate simultaneous eigenfunctions. [An eigenfunction is
understood to be nondegenerate when there is no other eigenfunction
with the same set of eigenvalues which is linearly independent of it.
When normalized, it is uniquely determined except for an arbitrary
constant factor of absolute value unity (phase factor).]

Any two operators which can enter into a common set of the above sort
are said to be mutually compatible. When the simultaneous eigenfunctions
of any set of mutually compatible operators are actually nondegenerate,
the set is said to be complete (cf. p. 287). It is to be observed
that the completeness of a set of mutually compatible operators is wholly
different from the completeness of a system of functions.

The set of dynamical variables to which such a complete set of 
mutually compatible type 1 operators gives rise constitutes what we shall 
call a type 1 coordinate system. 

As previously indicated (Sec. 26b) the root-mean-square convergence
of $\sumint_K c(\alpha')\phi(x|\alpha')$ to the function ψ(x), i.e., the validity of (36·15), is by





no means sufficient for a rigorous proof that the series converges at
every point to the function ψ(x). Furthermore the point by point
expansion

$\psi(x) = \sumint_{\alpha'} c(\alpha')\phi(x|\alpha')$    (36·17)

is not absolutely necessary in quantum mechanics provided that the
relation (36·15) is available.¹ However, this expansion, or an equivalent,
is essential to the development of the formalism of the Dirac-Jordan
transformation theory. Hence it is a source of satisfaction to know
that the investigations of Weyl in one dimension (Sec. 30b) strongly
suggest that the right-hand member of (36·17) does converge uniformly
to the value ψ(x) provided that the latter function is continuous and
belongs to the Hermitian domain of the α's. If this suggestion is correct,
the expansion (36·17) is good for all physically admissible functions ψ(x)
when the α's form a type 1 coordinate system.

A more cumbersome alternative method for calculating ψ(x) from the
Fourier coefficients c(α') is to use the formula

$\psi(x) = \lim_{\eta\to0}\sumint_{\alpha'} c(\alpha')\,\phi_\eta(x|\alpha'),$    (36·18)

where $\phi_\eta(x|\alpha')$ is the mean value of φ(x|α') over a hypercube of side η
in x space and containing the point in question. It is not difficult to
show that this equation is a direct consequence of (36·15) if ψ(x) is
continuous, even if the series (36·17) is not uniformly convergent.

For this purpose it is convenient to make use of "step functions" defined as follows.
Let w(x) be any quadratically integrable function of the Cartesian coordinates $x_1,\cdots,x_{3n}$.
Let $w_\eta(x)$ be a function derived from w(x) by dividing coordinate space
into hypercubes of side η and replacing w(x) at each point by its average value in
the hypercube to which the point belongs. We call the "smoothed" function $w_\eta(x)$
obtained in this manner the step function of w(x) for the given system of hypercubes
because in one dimension the graph of $w_\eta(x)$ would have step form. Let $\Delta_\eta(x)$ denote
the difference $w(x) - w_\eta(x)$. The scalar product of $\Delta_\eta(x)$ and $w_\eta(x)$ is zero since the
contribution of every individual hypercube to that scalar product is zero. Hence

$(w(x),w(x)) = (w_\eta(x),w_\eta(x)) + (\Delta_\eta(x),\Delta_\eta(x)).$

Consequently

$\|w(x)\| \ge \|w_\eta(x)\|.$    (36·18a)

Let us now introduce step functions $\psi_\eta(x)$ and $\phi_\eta(x|\alpha')$ formed from ψ(x) and
φ(x|α') with the same system of hypercubes. Finally, let $F_\eta(x,R)$ denote the step
function formed from the difference

$F(x,R) = \psi(x) - \sumint_G \phi(x|\alpha')\,c(\alpha'),$

¹ It does not appear in the von Neumann formulation of the theory to be described
briefly in Sec. 36f.




where G is identified with the interior of a hypersphere of radius R in α' space. Clearly

$F_\eta(x,R) = \psi_\eta(x) - \sumint_G \phi_\eta(x|\alpha')\,c(\alpha').$

From the completeness relation (36·15) we know that $\lim_{R\to\infty}\|F(x,R)\| = 0$. It follows
from the inequality (36·18a) that $\lim_{R\to\infty}\|F_\eta(x,R)\| = 0$. But $F_\eta(x,R)$ is piece-by-piece
continuous. Consequently $\lim_{R\to\infty}\|F_\eta(x,R)\| = 0$ implies $\lim_{R\to\infty}F_\eta(x,R) = 0$. Thus

$\psi_\eta(x) = \sumint_{\alpha'} \phi_\eta(x|\alpha')\,c(\alpha').$

If the hypercube side η is now allowed to approach zero, the continuity of ψ(x) demands
that $\lim_{\eta\to0}\psi_\eta(x) = \psi(x)$. This proves (36·18).
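The step-function argument rests on the orthogonality of $\Delta_\eta$ and $w_\eta$ and on the inequality (36·18a). A minimal numerical sketch follows, assuming one dimension, a real sampled function, and a simple Riemann-sum scalar product; the sample function and grid are hypothetical choices made only for illustration.

```python
import math


def step_function(values, cell):
    """Replace each sample by the average over its cell of `cell` samples."""
    out = []
    for start in range(0, len(values), cell):
        block = values[start:start + cell]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out


def scalar_product(f, g, dx):
    """Riemann-sum approximation to the scalar product of two real functions."""
    return sum(a * b for a, b in zip(f, g)) * dx


dx = 0.01
xs = [-5.0 + dx * (i + 0.5) for i in range(1000)]
w = [math.exp(-x * x) * math.cos(3.0 * x) for x in xs]  # assumed sample function
w_eta = step_function(w, cell=50)                       # hypercube side eta = 0.5
delta = [a - b for a, b in zip(w, w_eta)]               # Delta_eta = w - w_eta

cross = scalar_product(delta, w_eta, dx)                # vanishes cell by cell
norm_w = math.sqrt(scalar_product(w, w, dx))
norm_w_eta = math.sqrt(scalar_product(w_eta, w_eta, dx))
print(cross, norm_w, norm_w_eta)
```

Because the cross term vanishes up to rounding, the norm of the step function can never exceed that of the original function, which is the content of (36·18a).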

If ψ(x) is not continuous, we have only to replace the left-hand
member of (36·18) by $\lim_{\eta\to0}\psi_\eta(x)$, which can differ from ψ(x) only at points
of discontinuity and must have the same Fourier coefficients as ψ(x).
For physical purposes there is no need to distinguish between ψ(x) and
$\lim_{\eta\to0}\psi_\eta(x)$, for every physical prediction made from a wave function
involves a process of integration with respect to which they are equivalent.

The above justification of (36·18) suggests a valuable reinterpretation
of (36·13). The quantity $\eta^{-m}\Delta_\eta^{\alpha'}\phi$ which appears in the right-hand
member of (36·13) can be identified with a step function formed from
φ(x|α') by introducing hypercubes in α' space and replacing φ in each
hypercube by the corresponding mean value. The parallelism between
(36·13) and (36·18) then becomes complete.

36e. Calculation of Probabilities. — Let us turn our attention next to
the physical interpretation of the functions φ(x|α') and c(α') which
appear in (36·17). We shall refer to each of the functions φ(x|α') and to
any discrete-continuous linear combination of these functions involving
but a single value of $\alpha_k'$ as an eigenfunction of $\alpha_k$. Class D eigenfunctions
of $\alpha_k$, if any, are interpreted as descriptions of physically possible
subjective states with a unique value of the variable $\alpha_k$, viz., $\alpha_k'$. We assume
that a measurement of $\alpha_k$ for any member of an assemblage of identical
systems in such a state must necessarily yield the single result $\alpha_k'$. In
general the eigenfunctions of $\alpha_k$, whether discrete or of the
continuous-spectrum type, are not of class D. According to our basic assumption
they do not represent physically realizable subjective states, but each
can be regarded as the limit of a sequence of physically admissible
functions. Hence we shall refer to them as the wave functions of states
with unique values of $\alpha_k$, even though we can never realize them exactly
in practice.

A simultaneous eigenfunction of two or more variables must then 
represent a state in which each of these variables has a unique value. 





When a complete system of such states exists, there is no theoretical 
reason, like the Heisenberg inequality of Sec. 16, why simultaneous 
observations of arbitrary accuracy should not be made on all the variables 
in the group. Hence it is usual to speak of mutually compatible dynami- 
cal variables as simultaneously measurable, although we may have no 
concrete plan for carrying out a simultaneous measurement. 

In Sec. 15 we used an operational definition of linear momentum to
develop a scheme for calculating the relative values of different measured
values of the momentum. The scheme has been generalized for the
prediction of energies, and we here carry the generalization a step farther,
postulating that the probability that the variables $\alpha_1,\alpha_2,\cdots,\alpha_\lambda$,
which make up a complete set of mutually compatible operators, have a
set of values in the domain G of α' space is $\sumint_G |c(\alpha')|^2$. Mathematically,
this means that c(α') plays the role of probability amplitude for the
"coordinates" $\alpha_1,\alpha_2,\cdots,\alpha_\lambda$. In the notation of Sec. 36c c(α')
becomes $\psi_\alpha(\alpha')$. Equations (36·13) and (36·14) give explicit form to the
operator T, while (36·18) does the same thing for $T^{-1}$.

Physically this postulate means that if a simultaneous exact measurement
of all the α's could be carried out for a large number of identical
physical systems, so prepared that they have a common initial state
described by ψ(x), the most probable value of the fraction yielding
measured values of the α's corresponding to points in G is $\sumint_G |c(\alpha')|^2$.
If we do not know how to carry out such a measurement, we shall still
refer to the above expression as the probability of the domain G in α' space.
In other words, this is our mathematical definition of the probability in
question. In view of (36·15) it satisfies the fundamental requirement
that the sum of the probabilities of all possible sets of simultaneous
eigenvalues is unity. Also the probability of a set of discrete eigenvalues
is unity in the case of a simultaneous eigenfunction with those eigenvalues.

To get the probability that an individual variable $\alpha_k$ has a value in the
range $\alpha_k'' < \alpha_k' < \alpha_k'''$ we have only to identify G with the entire domain
in α' space defined by the double inequality, thus summing over all
possible values of the other variables in the α coordinate system.

It follows from the above procedure for calculating the probabilities
of the eigenvalues of $\alpha_k$ that if the variable $\alpha_k$ has an expectation or mean
value for systems in the state ψ(x), that value must be given by the
expression

$\bar\alpha_k = \sumint_{\alpha'} \alpha_k'\,|c(\alpha')|^2.$    (36·19)

By hypothesis a type 1 operator $\alpha_k$ must be Hermitian with respect to a





linear manifold of functions $D_{\alpha_k}$, which includes D. It follows that if
ψ(x) is a class D function (physically admissible), the expectation value of
$\alpha_k$ for the state ψ(x) is also given by the formula

$\bar\alpha_k = (\alpha_k\psi,\psi).$    (36·20)

To prove this proposition we note that if ψ(x) belongs to $D_{\alpha_k}$, $\alpha_k\psi$ is
quadratically integrable so that the scalar product $(\alpha_k\psi,\psi)$ exists and can
be evaluated by means of (36·16). Let d(α') denote the general expansion
coefficient of $\alpha_k\psi(x)$ with respect to the φ(x|α')'s. With the aid of
(36·10) and the Hermitian property we deduce

$d(\alpha') = \lim_{\eta\to0}\bigl[\eta^{-m}\bigl(\alpha_k\psi(x),\Delta_\eta^{\alpha'}\phi(x|\alpha')\bigr)\bigr]
 = \lim_{\eta\to0}\Bigl[\eta^{-m}\Bigl(\psi(x),\sumint_{G(\alpha',\eta)}\alpha_k''\,\phi(x|\alpha'')\Bigr)\Bigr] = \alpha_k'\,c(\alpha').$    (36·21)

By (36·16) the right-hand members of (36·19) and (36·20) are equal, as
was to be proved.
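The equality of (36·19) and (36·20) can be verified directly in a finite-dimensional stand-in for the theory. The sketch below assumes a hypothetical two-state system: a real symmetric 2 × 2 matrix plays the part of the operator, its two orthonormal eigenvectors play the part of the φ's, and ordinary dot products replace the scalar product.

```python
import math

# Hypothetical two-state stand-in: a real symmetric matrix with eigenvalues 1 and 3.
ALPHA = [[2.0, 1.0],
         [1.0, 2.0]]
EIGENVALUES = [1.0, 3.0]
S = 1.0 / math.sqrt(2.0)
EIGENVECTORS = [[S, -S],   # belongs to eigenvalue 1
                [S, S]]    # belongs to eigenvalue 3

psi = [0.6, 0.8]           # normalized state vector


def dot(f, g):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(f, g))


def apply(matrix, vector):
    """Matrix acting on a vector."""
    return [dot(row, vector) for row in matrix]


# Fourier coefficients c = (psi, phi) and the probabilities |c|^2.
c = [dot(psi, phi) for phi in EIGENVECTORS]
total_probability = sum(ck ** 2 for ck in c)

# Mean value computed as in (36.19) and as in (36.20).
mean_from_probabilities = sum(a * ck ** 2 for a, ck in zip(EIGENVALUES, c))
mean_from_operator = dot(apply(ALPHA, psi), psi)
print(total_probability, mean_from_probabilities, mean_from_operator)
```

Both routes give the same mean value, and the probabilities add up to unity, as (36·15) requires.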

The theorem of the equivalence of Eqs. (36·19) and (36·20) is the
ultimate justification of the restriction of physically admissible functions
to class D and the restriction of dynamical variables to operators
Hermitian with respect to class D. A second important inference from
(36·21) is obtained by reverting to the notation of Sec. 36c, in which it
takes the form

$T\alpha_k^{(x)}\psi(x) = \alpha_k'\,\psi_\alpha(\alpha').$

It follows at once that

$\alpha_k^{(\alpha)}\psi_\alpha(\alpha') = T\alpha_k^{(x)}T^{-1}\psi_\alpha(\alpha') = \alpha_k'\,\psi_\alpha(\alpha').$    (36·22)

Thus the operator $\alpha_k^{(\alpha)}$ is the operator which multiplies $\psi_\alpha(\alpha')$ by $\alpha_k'$.

In concluding this discussion of type 1 operators we call the reader's
attention to the fact that we can define an operator in terms of a complete
system of normalized orthogonal eigenfunctions instead of deriving the
system of functions from a previously known operator by (36·1). If the
functions φ(x|α') are given, the corresponding operators T and $T^{-1}$
are defined by Eqs. (36·13) and (36·18). The operator $\alpha_k^{(\alpha)}$ can be
identified with the multiplication operator $[\alpha_k'\times]$ and $\alpha_k^{(x)}$ is then obtained
by reversing (36·5). Thus

$\alpha_k^{(x)}\psi(x) = \lim_{\eta\to0}\sumint_{\alpha'} \phi_\eta(x|\alpha')\,\alpha_k'\,\psi_\alpha(\alpha').$    (36·23)





A plausible generalization of the well-known Fischer-Riesz theorem¹
suggests that for the existence of a quadratically integrable transform
$\alpha_k^{(x)}\psi(x)$ it is sufficient as well as necessary that the sum-integral

$\sumint_{\alpha'} \alpha_k'^{\,2}\,|\psi_\alpha(\alpha')|^2$    (36·24)

shall exist.

36f. Type 2 Operators as Dynamical Variables; the Method of von 
Neumann. — The positional coordinates form the basic dynamical vari- 
ables of classical theory, and of quantum theory as well. Our original 
physical interpretation of the quantity $|\psi(x)|^2$, together with the
assumption that the wave functions ψ are defined and continuous over
all Cartesian-coordinate space, implies that the spectrum of possible
measured values of each Cartesian positional coordinate ranges from
-∞ to +∞, and that all of these coordinates are simultaneously measurable
to any desired precision. It also defines by implication the range
of every classically legitimate positional coordinate $q(x_1,\cdots,x_{3n})$.
Finally it tells us that the primary probability amplitude ψ(x) has the
same relation to positional measurement that c(α') has to measurements
of the α's. Hence we have no immediate need to write down operators
for the positional coordinates and work out their eigenfunctions except
as a means for unifying the theory of the type 1 dynamical variables
with that of the type 2 variables of which the positional coordinates
are typical.

If we do wish to unify the theory, we can at once identify the operator
for a positional coordinate $q(x_1,\cdots,x_{3n})$ with the operation of
multiplying by the number $q(x_1,\cdots,x_{3n})$. We symbolize this operation
by [q×]. This identification, suggested on p. 243, is confirmed by
Eq. (36·22). Any eigenfunction of the equation

$q\psi = [q\times]\psi = q'\psi$    (36·25)

must accordingly vanish except at points on the surface

$q(x_1,\cdots,x_{3n}) = q'.$    (36·26)

Although the operator is linear and Hermitian with respect to the manifold
of all quadratically integrable functions with quadratically integrable
transforms, it is impossible to set up a corresponding complete normal
orthogonal system of eigenfunctions. In fact the solutions of (36·25) are

¹ Cf. von Neumann, M.G.Q., p. 16; F. Riesz, Comptes Rendus, 144, 616-619 (1907);
E. Fischer, Comptes Rendus, 144, 1022-1024 (1907). The theorem in the original
form given by Riesz for a sequence of real numbers $c_n$ states that if $\sum_{n=1}^{\infty} c_n^2$ is convergent,
and if $\phi_1,\phi_2,\cdots$ form a complete orthonormal set of functions, there exists a
quadratically integrable function, say f, such that $c_n = (f,\phi_n)$ for every n.





so discontinuous that the ordinary normalization formulas for
continuous-spectrum eigenfunctions cannot be fulfilled. Any attempt to represent
an ordinary wave function as a continuous linear combination of
eigenfunctions of the operator [q×] must fail, because any integral of the
form $\int c(q')\psi_{q'}(x)\,dq'$ must vanish due to the circumstance that for any
fixed set of values of the Cartesian coordinates the integrand must
vanish except at the isolated value of q' defined by (36·26).

We shall briefly describe two schemes for dealing with the above
difficulty. The first and more rigorous method is due to von Neumann.¹
It involves a reformulation of the eigenvalue-eigenfunction problem which
permits us to deal with all dynamical variable operators on the same basis.
The von Neumann formulation rests upon the observation that in the
case of a real dynamical variable α of either type 1 or type 2, it is possible
to resolve any function ψ in the Hermitian manifold of α into the sum
of two parts, say $\psi_1$ and $\psi_2$, each belonging to the same Hermitian manifold
and such that $\psi_1$ is associated with the portion of the spectrum of
eigenvalues below an arbitrary value σ, while $\psi_2$ is associated with the
portion above that point. The resolution has the important property
that, independent of ψ,

$\frac{(\alpha\psi_1,\psi_1)}{(\psi_1,\psi_1)} \le \sigma \le \frac{(\alpha\psi_2,\psi_2)}{(\psi_2,\psi_2)}.$

By making a number of such cuts we can resolve ψ into any desired
number of parts associated with different non-overlapping intervals
of the spectrum. (In the case of a multiplication operator [q×] the
eigenvalue spectrum is identified with the totality of the values which
the multiplying factor is allowed to take on.)

Let α denote an arbitrary type 1 operator and let $\beta_1,\beta_2,\cdots,\beta_\lambda$
denote a complete set of type 1 operators of which α is a member ($\alpha = \beta_k$).
Let $E_\alpha(\sigma)$ denote the operator defined by

$E_\alpha(\sigma)\psi(x) = \lim_{\eta\to0}\sumint_{\alpha'\le\sigma}\Bigl\{\phi_\eta(x|\beta')\lim_{\eta'\to0}\bigl[\eta'^{-m}(\psi,\Delta_{\eta'}^{\beta'}\phi)\bigr]\Bigr\}.$    (36·27)

Here the notation is that of Eqs. (36·7) to (36·18). The symbol $\sumint_{\alpha'\le\sigma}$
denotes the application of the operator $\sumint_{\beta'}$ to that portion of β'
space for which α' ≤ σ. The operators in the one-parameter family so
defined are clearly applicable to every quadratically integrable function
ψ(x) and must yield quadratically integrable transforms when applied
to such a function. It follows from (36·12) that the Fourier coefficient

¹ von Neumann, M.G.Q., Kap. II, Ziff. 6, 7, 8.





function c(α') for the primary function $u(x,\sigma) = E_\alpha(\sigma)\psi(x)$ is identically
zero in the region α' > σ but is equal to the Fourier coefficient function
for ψ(x) itself in the region α' ≤ σ. Equation (36·18) in conjunction
with the paragraph following the fine print on p. 256 shows that for
physical purposes we may consider that a function is defined by its
Fourier coefficients. Hence the application of the operator $E_\alpha(\sigma)$ to
$E_\alpha(\sigma)\psi(x)$ leaves the function unchanged.

$E_\alpha(\sigma)[E_\alpha(\sigma)\psi] = E_\alpha(\sigma)\psi.$    (36·28)

By Eq. (36·16),

$(E_\alpha(\sigma)\psi_A,\psi_B) = \sumint_{\alpha'\le\sigma} c_A(\beta')\,c_B^*(\beta').$    (36·29)

Hence

$(E_\alpha(\sigma)\psi_A,\psi_B) = (\psi_A,E_\alpha(\sigma)\psi_B).$    (36·30)

Here $\psi_A(x)$ and $\psi_B(x)$ are any two quadratically integrable functions of
the Cartesian coordinates. Thus $E_\alpha(\sigma)$ is Hermitian with respect to the
manifold of all quadratically integrable functions. Operators with the
properties defined by Eqs. (36·28) and (36·30) are called projection
operators.

The operator $E_\alpha(\sigma)$ also has the properties indicated by the following
self-explanatory equations:

$E_\alpha(\sigma'')E_\alpha(\sigma')\psi = E_\alpha(\sigma')E_\alpha(\sigma'')\psi = E_\alpha(\sigma')\psi;\quad \sigma'\le\sigma''$    (36·31)

$\|E_\alpha(\sigma')\psi\| \le \|E_\alpha(\sigma'')\psi\|;\quad \sigma'\le\sigma''$    (36·32)

$\lim_{\sigma\to-\infty} E_\alpha(\sigma)\psi(x) = 0;$    (36·33)

$\lim_{\sigma\to+\infty} E_\alpha(\sigma)\psi(x) = \psi(x).$    (36·34)

von Neumann calls a family of projection operators which have the
properties (36·31) to (36·34) a resolution of unity.

None of the properties listed in defining a resolution of unity makes
specific connection with the operator α from which $E_\alpha(\sigma)$ was derived.
In order to make such a connection in a generalizable form we note that
Eq. (36·23) can be written in the form

$\alpha\psi(x) = \int_{-\infty}^{+\infty}\sigma\,d[E_\alpha(\sigma)\psi(x)],$    (36·35)

where the right-hand member is a Stieltjes integral.¹ An alternative

¹ The Stieltjes integral is a slight generalization of the ordinary Riemann integral
frequently employed by physicists. To define an integral of this type over an
interval a < x < b of the x axis, we imagine the interval divided into n equal parts
$x_\tau < x < x_{\tau+1}$ and set

$\int_a^b u(x)\,dv(x) = \lim_{n\to\infty}\sum_\tau u(x_\tau)\,[v(x_{\tau+1}) - v(x_\tau)].$

If v(x) is differentiable, the Stieltjes integral reduces to the ordinary Riemannian





equivalent scheme for relating α with the resolution of unity $E_\alpha(\sigma)$ is
to use the equation

$(\phi,\alpha\psi) = \int_{-\infty}^{+\infty}\sigma\,d(\phi,E_\alpha(\sigma)\psi),$    (36·36)

which follows directly from (36·16), (36·21), and (36·29) provided that φ
is quadratically integrable and ψ belongs to a Hermitian manifold of α
which includes the eigendifferentials $\Delta_\eta^{\alpha'}\phi$.
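The Stieltjes sums that enter (36·35) and (36·36) are easily evaluated numerically. The sketch below follows the left-endpoint definition with n equal parts; the function name `stieltjes_sum` and the two test integrands are hypothetical choices made for illustration. It contrasts a differentiable v(x), where the sum tends to an ordinary integral, with a v(x) having a single finite jump, which is the situation met at a discrete eigenvalue.

```python
def stieltjes_sum(u, v, a, b, n):
    """Approximate the Stieltjes integral of u dv over (a, b) with n equal parts,
    evaluating u at the left end of each part."""
    h = (b - a) / n
    total = 0.0
    for r in range(n):
        x_r = a + r * h
        x_next = a + (r + 1) * h
        total += u(x_r) * (v(x_next) - v(x_r))
    return total


# Differentiable v(x) = x^2: the sum tends to the integral of x * 2x over (0, 1) = 2/3.
smooth = stieltjes_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, 2000)

# Step v(x) with a jump of height 2 at x = 0.5: the sum tends to 2 * u(0.5) = 0.5.
jump = stieltjes_sum(lambda x: x * x,
                     lambda x: 2.0 if x >= 0.5 else 0.0,
                     0.0, 1.0, 2000)
print(smooth, jump)
```

In the second case the whole value of the integral comes from the single subinterval containing the jump, just as a discrete eigenvalue contributes a finite term to the right-hand member of (36·36).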

Let $N_\alpha$ denote a Hermitian manifold of the above type. As a corollary
on (36·36) we have the equation

$(\alpha\psi,\alpha\psi) = \int_{-\infty}^{+\infty}\sigma^2\,d\|E_\alpha(\sigma)\psi\|^2,$    (36·37)

valid for every ψ in $N_\alpha$. The right-hand member of this equation must
therefore converge for every such function ψ. It can also happen that the
right-hand member, which is defined for every quadratically integrable ψ,
is convergent for functions not in $N_\alpha$. The right-hand member of
(36·37) plays the part of the sum of the squares of the absolute values
of the Fourier coefficient in the Fischer-Riesz theorem (cf. footnote 1,
p. 259). Hence, by analogy with that theorem, we may assume that if
the right-hand member of (36·37) is convergent, there exists a corresponding
quadratically integrable function w(x), whose Fourier coefficients
are determined by

$(\phi,w) = \int_{-\infty}^{+\infty}\sigma\,d(\phi,E_\alpha(\sigma)\psi).$    (36·37a)

Now two functions, whose Fourier coefficients with respect to any
complete orthogonal system are the same, can differ only at points of
Lebesgue measure zero, i.e., at a set of singular points which are of no
importance when one forms a definite integral over any domain. As
noted at the end of Sec. 36d two functions which differ only in this way
are indistinguishable for the purposes of wave mechanics. It follows
from (36·36) and (36·37a) that if the function αψ is defined, it can be
identified with w. If it is not defined a priori, we can define it as identical
with w. If φ belongs to $N_\alpha$ we see from (36·30) that

$\int_{-\infty}^{+\infty}\sigma\,d(\phi,E_\alpha(\sigma)\psi) = \int_{-\infty}^{+\infty}\sigma\,d(E_\alpha(\sigma)\phi,\psi) = (\alpha\phi,\psi).$

integral $\int_a^b u(x)\frac{dv}{dx}\,dx$. In (36·36), however, the function $E_\alpha(\sigma)\psi(x)$ is not continuous
in σ or differentiable with respect to σ at the discrete eigenvalues of α. In the case
of the corresponding formula [cf. Eq. (36·38)]
$q\psi(x) = \int_{-\infty}^{+\infty}\sigma\,d[E_q(\sigma)\psi(x)]$
there is a discontinuity at σ = q for every value of q.






Thus we can add the function ψ to the Hermitian manifold $N_\alpha$ without
spoiling its Hermitian character, provided that the right-hand member
of (36·37) is convergent. Let $M_\alpha$ denote the most inclusive Hermitian
manifold obtainable in this way by extending $N_\alpha$. Then $M_\alpha$ is identical
with the manifold of all quadratically integrable functions for which the
integral $\int_{-\infty}^{+\infty}\sigma^2\,d\|E_\alpha(\sigma)\psi\|^2$ is convergent.

Thus far we have defined $E_\alpha(\sigma)$ by the explicit formula (36·27).
von Neumann, however, defines the resolution of unity associated with a
Hermitian operator by its properties, i.e., by Eqs. (36·28), (36·30) to
(36·34), and (36·36). To be precise, he shows that if the operator α
has a Hermitian manifold $N_\alpha$ which is everywhere dense in the Hilbert
space of all quadratically integrable functions (cf. Sec. 32c, p. 201),
and if $E_\alpha'(\sigma)$ and $E_\alpha''(\sigma)$ are two resolutions of unity for which (36·36)
holds, provided that ψ belongs to $N_\alpha$ and φ is quadratically integrable,
then $E_\alpha'(\sigma) = E_\alpha''(\sigma)$. In other words the resolution of unity associated
with a given operator operating on a given Hermitian manifold of functions
is unique. The working out of such a resolution of unity is von
Neumann's equivalent of the eigenvalue-eigenfunction problem. At the
end of this section it will be proved that $E_\alpha(\sigma)$ defines a spectrum of
eigenvalues and in conjunction with a wave function ψ(x) fixes their
probabilities.

Contrary to the usage of this book von Neumann includes in the
definition of each Hermitian operator α the specification of a definite
Hermitian manifold $N_\alpha$ to which alone he considers it applicable.¹
Hence he can say flatly that a given operator has at most one resolution
of unity, whereas our definitions admit the possibility of two or more
different resolutions associated with different Hermitian manifolds. In
order to remove the ambiguity thus introduced into our theory we shall
exclude from consideration for physical purposes all Hermitian manifolds
not of type D (cf. p. 251).

1 We define the Hamiltonian operator H used in the hydrogen-atom problem by 



With this operator von Neumann associates a second operator which is identical with
H for all functions which belong to a manifold $M_H$ with respect to which
H is Hermitian. This second operator is not defined for solutions of Hψ = Eψ which do
not belong to $M_H$, but can be and is defined for all non-differentiable functions which make
the integral analogous to the right-hand member of (36·37) convergent. Whenever von
Neumann speaks of an Hermitian operator he means one of this type. Every solution of the
eigenvalue equation for the second operator is a discrete eigenfunction of that operator,
but if we wish to find these solutions we must first investigate solutions of
Hψ = Eψ and then inquire which of them belong to the manifold $M_H$.





There is no difficulty in formulating a resolution of unity for a
multiplicative operator [q×] as well as for an operator of type 1. It is
only necessary to presuppose that qψ is quadratically integrable when ψ
belongs to class D. In that case [q×] does have a type D Hermitian
manifold which includes every other Hermitian manifold.

The family of operators $E_q(\sigma)$ defined by

$E_q(\sigma)\psi(x) = \psi(x),\quad q(x)\le\sigma;$
$E_q(\sigma)\psi(x) = 0,\quad q(x)>\sigma$    (36·38)

is readily seen to fulfill all the requirements for a resolution of unity.
Furthermore Eqs. (36·36) and (36·37) remain valid when [q×] and $E_q(\sigma)$
are substituted for α and $E_\alpha(\sigma)$, respectively, provided only that ψ belongs
to the Hermitian manifold of [q×]. Consequently (36·38) solves the
problem of defining a resolution of unity for the operator [q×].
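The cut-off operators of (36·38) lend themselves to a direct numerical check. The sketch below is an illustration under stated assumptions: one dimension, q(x) = x, a normalized Gaussian wave function sampled on a finite grid, and a Riemann-sum norm. The names `E`, `norm_squared`, and the grid parameters are hypothetical.

```python
import math

dx = 0.01
xs = [-6.0 + dx * (i + 0.5) for i in range(1200)]
# Assumed wave function: normalized Gaussian, so that |psi|^2 has variance 1/2.
psi = [math.exp(-0.5 * x * x) / math.pi ** 0.25 for x in xs]


def q(x):
    """Multiplying factor of the operator [q x]; here simply q(x) = x."""
    return x


def E(sigma, f):
    """Cut-off operator of (36.38): keep f where q(x) <= sigma, else zero."""
    return [fx if q(x) <= sigma else 0.0 for x, fx in zip(xs, f)]


def norm_squared(f):
    return sum(fx * fx for fx in f) * dx


once = E(0.3, psi)
twice = E(0.3, once)                          # projection property (36.28)

sigmas = [-6.0 + 0.05 * j for j in range(241)]
dist = [norm_squared(E(s, psi)) for s in sigmas]   # ||E(sigma) psi||^2

# Stieltjes sum for the right-hand member of (36.37): sum of sigma^2 d||E psi||^2.
mean_q2 = sum(sigmas[j] ** 2 * (dist[j + 1] - dist[j]) for j in range(240))
print(dist[0], dist[-1], mean_q2)
```

The function `dist` rises monotonically from zero to unity, as (36·32) to (36·34) require, and the Stieltjes sum reproduces (qψ,qψ) = 1/2 for this Gaussian to within the coarseness of the σ grid.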

The question now arises whether, or not, the von Neumann eigenvalue
problem is solvable not only for multiplicative operators and for operators
known a priori to be of type 1, but for all operators Hermitian with respect
to a manifold of functions which includes class D. This question has not
been considered by competent mathematicians, although a somewhat
more general form of the existence problem for Hermitian operators
has been treated by von Neumann. It appears from his work that we
cannot assume the existence of the desired resolution of unity for every
Hermitian operator. Hence it is necessary from this standpoint to
require that if a linear operator $\alpha^{(x)}$ with a type D Hermitian manifold
is to represent a real dynamical variable in Cartesian-coordinate space,
a corresponding resolution of unity must exist. We make this the definition
of the operator representing a real dynamical variable in Cartesian-coordinate
space and define the dynamical variable itself as the class of operators
generated by $\alpha^{(x)}$ using transformations of the type (36·5).

It is now possible to define the probability of the eigenvalues of α in
an interval I: σ' < α ≤ σ'', very simply in terms of $E_\alpha(\sigma)$. Let $F_\alpha(I)$
denote the operator $E_\alpha(\sigma'') - E_\alpha(\sigma')$. $F_\alpha(I)\psi$ is the part of ψ which
belongs to the spectrum interval I. In harmony with the rule of p. 257
we identify $\|F_\alpha(I)\psi\|^2$ with the probability of values of α in the interval I
for systems in the state ψ. In other words the interval dσ has the
probability $d\|E_\alpha(\sigma)\psi\|^2$. This definition is evidently in harmony with
Eqs. (36·19) and (36·20). It works as well for type 2 operators as for
those of type 1. The definitions of discrete and continuous-spectrum
eigenvalues are seen to be implicit in the definition of the probability
of a range of eigenvalues. If we plot the probability of the range I
as a function of the upper limit σ'', we obtain a monotonically increasing
function which in general (i.e., for wave functions containing contributions
from all parts of the spectrum) has finite discontinuities at


Sec. 36] 


DYNAMICAL VARIABLES IN GENERAL 


265 


discrete eigenvalues, a positive slope at points on the continuous spec- 
trum, and zero slope at points which are not eigenvalues. Figure 16 
llustrates such a plot. 
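The behaviour of this probability function can be sketched numerically for the simplest case, a purely discrete spectrum. The following is only an illustration under stated assumptions: the spectrum and the expansion coefficients are invented, and the function names `interval_probability` and `cumulative_probability` are hypothetical, not taken from the text. In the eigenbasis of $\alpha$, the operator $E_\alpha(\sigma)$ simply keeps the components whose eigenvalues do not exceed $\sigma$.

```python
def interval_probability(eigenvalues, amplitudes, lo, hi):
    """Squared norm of F(I)psi for I: lo < a <= hi, relative to the norm of psi."""
    total = sum(abs(c) ** 2 for c in amplitudes)
    inside = sum(abs(c) ** 2
                 for a, c in zip(eigenvalues, amplitudes) if lo < a <= hi)
    return inside / total


def cumulative_probability(eigenvalues, amplitudes, sigma):
    """Squared norm of E(sigma)psi: weight of all eigenvalues a <= sigma."""
    return interval_probability(eigenvalues, amplitudes, float("-inf"), sigma)


# Hypothetical spectrum with a doubly degenerate level at 2.0 (assumption).
eigenvalues = [1.0, 2.0, 2.0, 5.0]
amplitudes = [0.5, 0.5j, -0.5, 0.5]

# The printed values form a non-decreasing staircase that jumps only at eigenvalues.
for sigma in (0.0, 1.0, 1.5, 2.0, 4.0, 5.0):
    print(sigma, cumulative_probability(eigenvalues, amplitudes, sigma))
```

A continuous spectrum would replace the jumps by stretches of positive slope; that case is not covered by this sketch.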

36g. The Method of Dirac and Jordan. — The second common way of 
unifying the theories of the type 1 and type 2 operators is to make use of a
nonrigorous formalism for operators of the second variety which parallels
that already developed for operators of the first type.

Fig. 16. Plot of $\|F_\alpha(I)\psi\|^2$ against upper limit of range $I$.

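In developing the procedure we consider first the variable $x$ in one
dimension. The function

$$\phi^{(\eta)}_{x'}(x) = \begin{cases} 0, & \text{if } x \le x', \\ 1/\eta, & \text{if } x' < x \le x' + \eta, \\ 0, & \text{if } x' + \eta < x, \end{cases} \tag{36·39}$$

approximates the properties of a discrete eigenfunction of the operator
$[x\times]$ if $\eta$ is a small quantity. It is quadratically integrable, being
normalized to the value $\|\phi^{(\eta)}_{x'}\|^2 = 1/\eta$. It describes a state in which the
variable $x$ cannot differ from the mean value $\bar{x} = x' + \tfrac{1}{2}\eta$ by more than
$\tfrac{1}{2}\eta$. The average value of the square of the error $x - \bar{x}$ is

$$\overline{(x - \bar{x})^2} = \eta \int_{x'}^{x'+\eta} (x - \bar{x})^2\,\big[\phi^{(\eta)}_{x'}(x)\big]^2\,dx = \frac{\eta^2}{12}.$$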
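The three properties quoted for the rectangular function (norm $1/\eta$, mean $x' + \tfrac{1}{2}\eta$, mean-square error $\eta^2/12$) can be checked by direct numerical integration. The sketch below is illustrative only; the names `rect` and `moments`, the midpoint rule, and the sample values of $x'$ and $\eta$ are assumptions introduced here.

```python
def rect(x, x0, eta):
    """Rectangular approximate eigenfunction: 1/eta on (x0, x0 + eta], else 0."""
    return 1.0 / eta if x0 < x <= x0 + eta else 0.0


def moments(x0, eta, n=20000):
    """Norm, mean and mean-square error of rect, by the midpoint rule."""
    h = eta / n
    xs = [x0 + (k + 0.5) * h for k in range(n)]
    weights = [rect(x, x0, eta) ** 2 * h for x in xs]   # |phi|^2 dx
    norm = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / norm
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / norm
    return norm, mean, var


norm, mean, var = moments(x0=1.0, eta=0.1)
print(norm, mean, var)   # compare with 1/eta, x0 + eta/2, eta**2/12
```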



If we divide the $x$ axis into an infinite number of equal intervals defined
by

$$x_r < x \le x_{r+1}, \qquad x_{r+1} = x_r + \eta,$$

and set $\phi^{(\eta)}_r(x) = \phi^{(\eta)}_{x_r}(x)$, we obtain an infinite set of mutually orthogonal
approximate eigenfunctions. This set is not complete, but if we allow $\eta$ to
approach zero it becomes more and more nearly so, while its members
become better approximations to the ideal of a true eigenfunction.
Thus, let

$$c^{(\eta)}(x_r) = (\psi, \phi^{(\eta)}_r) = \frac{1}{\eta}\int_{x_r}^{x_r+\eta} \psi(x)\,dx, \tag{36·40}$$



Fig. 17. Plot of $\psi(x)$ and the step-function approximation to $E_x(\sigma)\psi(x)$.

With the aid of the approximate eigenfunction $\phi^{(\eta)}_r(x)$ we readily
derive a form for the projection operator $E_x(\sigma)$ of (36·28) which is similar
to that used for type 1 operators. The product $\phi^{(\eta)}_r(x)\,c^{(\eta)}(x_r)\,\eta$ denotes a
"rectangular" function of $x$ which vanishes outside the interval

$$x_r < x \le x_r + \eta$$

and inside the interval has an ordinate equal to the mean ordinate
of $\psi$ in the same interval. Taking the point $x = \sigma$ as one of the
division points $x_r$, we see that

$$\sum_{x_r < \sigma} \phi^{(\eta)}_r(x)\,c^{(\eta)}(x_r)\,\eta$$

denotes a step function which approximates $\psi(x)$ for values of $x$ less
than $\sigma$ (cf. Fig. 17).

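Let $c^{(\eta)}(x')$ denote the step function whose value in the interval
$x_r < x' \le x_r + \eta$ is $c^{(\eta)}(x_r)$. Then

$$E_x(\sigma)\psi(x) = \lim_{\eta\to 0}\Big[\sum_{x_r<\sigma} \phi^{(\eta)}_r(x)\,c^{(\eta)}(x_r)\,\eta\Big] = \lim_{\eta\to 0}\Big[\int_{x'<\sigma} \phi^{(\eta)}_{x'}(x)\,c^{(\eta)}(x')\,dx'\Big]. \tag{36·44}$$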
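The limiting process in (36·44) can be illustrated numerically: replace $\psi$ by its cell averages below $\sigma$, set it to zero above $\sigma$, and watch the error shrink with the cell width. This is a sketch under assumptions made here: the Gaussian `psi`, the value of `sigma`, and all function names are hypothetical choices for illustration.

```python
import math


def psi(x):
    """Hypothetical smooth wave function used only for illustration."""
    return math.exp(-x * x / 2.0)


def cell_mean(f, left, eta, n=200):
    """Mean ordinate of f over (left, left + eta], by the midpoint rule."""
    h = eta / n
    return sum(f(left + (k + 0.5) * h) for k in range(n)) / n


def projected_step(f, x, sigma, eta):
    """Step-function approximation to E_x(sigma) f, evaluated at the point x."""
    if x > sigma:
        return 0.0                         # the projection removes x > sigma
    r = math.ceil((x - sigma) / eta) - 1   # cell with x_r < x <= x_r + eta
    return cell_mean(f, sigma + r * eta, eta)


def max_error(eta, sigma=0.5):
    """Largest deviation from psi on a grid of sample points below sigma."""
    points = [-3.0 + 0.01 * k for k in range(350)]
    return max(abs(projected_step(psi, x, sigma, eta) - psi(x)) for x in points)


for eta in (0.2, 0.1, 0.05):
    print(eta, max_error(eta))   # the error decreases as eta is reduced
```

The grid of cells is anchored at $\sigma$, so that $\sigma$ is always a division point, as the text requires.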


The parallelism with (36·27) is complete. It follows that the probability
of the range $x' < x < x''$ for the wave function $\psi(x)$ is

$$\int_{x'}^{x''} d\|E_x(\sigma)\psi(x)\|^2 = \lim_{\eta\to 0}\Big[\sum_{x' < x_r < x''} |c^{(\eta)}(x_r)|^2\,\eta\Big] = \int_{x'}^{x''} |\psi|^2\,dx. \tag{36·45}$$

There are sets of continuous approximate eigenfunctions of $[x\times]$
which can be used in similar fashion. Examples are the error function

Unix - x') = (36*46) 

suggested by Hylleraas¹ and the function

$$w_n(x - x') = \frac{\sin n(x - x')}{x - x'} \tag{36·47}$$

used in setting up Dirichlet's integral and the Fourier integral theorem.
These functions are normalized to the values   and $\pi n$, respectively.
They are not orthogonal but become more and more nearly so as
$n$ becomes infinite. Thus, if $x' \ne x''$, $\lim_{n\to\infty} (u_n(x - x'), u_n(x - x'')) = 0$.
It is not difficult to set up parallels to (36·42), (36·43), and (36·45) with
the aid of either of these additional types of approximate eigenfunction.
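If any of the sequences of functions $\phi^{(\eta)}_{x'}(x)$, $u_n(x - x')$, $w_n(x - x')$
were uniformly convergent over the entire interval $-\infty < x < +\infty$ as $1/\eta$,
or $n$, becomes infinite, we should have a corresponding limiting function,
say $\delta(x - x')$, with the properties

$$[x\times]\,\delta(x - x') = x'\,\delta(x - x'), \tag{36·48}$$

$$\int_{-\infty}^{+\infty} \delta(x - x')\,dx = \int_{-\infty}^{+\infty} \delta(x - x')\,dx' = 1, \tag{36·49}$$

$$\int_{-\infty}^{+\infty} f(x)\,\delta(x - x')\,dx = f(x'), \tag{36·50}$$

$$\int_{-\infty}^{+\infty} \delta(x - x')\,\delta(x - x'')\,dx = \delta(x' - x''). \tag{36·51}$$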


Of course the convergence fails at the point $x = x'$ and no function
$\delta(x - x')$ with these properties really exists. Nevertheless the existence
of a sequence of approximations such as $\phi^{(\eta)}_{x'}(x)$ is for quantum-mechanical
purposes tantamount to the existence of a $\delta(x - x')$. This hypothetical
function is called the Dirac² $\delta$ function and will be used as if it were genuine,
a procedure justified by the fact that, if we apply the formulas developed
for type 1 operators to Cartesian coordinates and employ eigenfunctions
assumed to have the properties (36·48) to (36·51), we get the
same results as if the method of von Neumann had been used.

The reader will note that the $\delta$ function is not quadratically integrable,
as follows from the normalization of the approximation functions, and
that it has the type of normalization introduced in Sec. 30 for
continuous-spectrum eigenfunctions [cf. Eq. (30·20)].
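Both remarks can be made concrete with the rectangular approximations of (36·39). In the sketch below (illustrative only; the test function, the point $x'$, and the names `sift` and `norm_squared` are assumptions made here) the integral of $f(x)\,\phi^{(\eta)}_{x'}(x)$ approaches $f(x')$ as $\eta$ shrinks, in the manner of (36·50), while the squared norm $1/\eta$ grows without limit, which is why no quadratically integrable limit function exists.

```python
import math


def sift(f, x0, eta, n=2000):
    """Integral of f(x) * phi(x) dx, where phi = 1/eta on (x0, x0 + eta]."""
    h = eta / n
    return sum(f(x0 + (k + 0.5) * h) for k in range(n)) * h / eta


def norm_squared(eta):
    """Exact integral of (1/eta)**2 over an interval of width eta."""
    return 1.0 / eta


x0 = 0.3
for eta in (0.1, 0.01, 0.001):
    # first column tends to cos(0.3); second column diverges
    print(eta, sift(math.cos, x0, eta), norm_squared(eta))
```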

¹ E. A. Hylleraas, Grundlagen der Quantenmechanik, p. 57, Oslo, 1932.

² P. A. M. Dirac, Proc. Roy. Soc. A113, 621 (1927); P.Q.M., 2d ed., p. 72.





Attention may be called at this point to the fact that the coordinate
symbols which form the arguments of the wave functions are not operators
but numbers representing possible eigenvalues of the coordinates.
The notation will therefore be improved as regards consistency and
clarity if we introduce double primes as well as primes for eigenvalues
and reserve the unprimed symbols $x$, $y$, $z$ for operators. Equation
(36·48), for example, is preferably written in the form

$$x\,\delta(x' - x'') = [x'\times]\,\delta(x' - x'') = x''\,\delta(x' - x''). \tag{36·52}$$

Nevertheless we shall frequently omit the primes on the arguments of the wave
functions when no serious ambiguity is likely to result.

*36h. Multiplication Operators in Many Dimensions. — A discussion 
of the three-dimensional case will perhaps sufficiently illustrate the 
general problem of multiplication operators in many dimensions. Let 
the reversible transformation 

$$\begin{aligned} q' &= f(x',y',z'), & x' &= \varphi(q',r',s'), \\ r' &= g(x',y',z'), & y' &= \theta(q',r',s'), \\ s' &= h(x',y',z'), & z' &= \chi(q',r',s'), \end{aligned} \tag{36·53}$$

map $x',y',z'$ space in a one-to-one manner, with the possible exception of
singular domains of zero volume, on a domain $Q$ of $q',r',s'$ space. Let the
typical multiplication operator $q$ be defined by

$$q = [q'\times] = [f(x',y',z')\times] = f([x'\times],[y'\times],[z'\times]).$$

The eigenfunctions of $q$ in $x',y',z'$ space are then solutions of the equation

$$f(x',y',z')\,\psi(x',y',z') = q''\,\psi(x',y',z'), \tag{36·54}$$

and may be assumed to have the form

$$\delta(q' - q'')\,F(x',y',z') \equiv \delta[f(x',y',z') - q'']\,F(x',y',z').$$

Simultaneous eigenfunctions of the three operators $q$, $r$, $s$ have the form
$\delta(q' - q'')\,\delta(r' - r'')\,\delta(s' - s'')\,F(x',y',z')$, the factor $F$ being chosen to
give proper normalization. Here $q'$, $r'$, $s'$ are functions of $x'$, $y'$, $z'$ defined
by (36·53).

It will simplify matters if at this point we introduce the simplified Dirac
notation for probability amplitudes. Let $(x',y',z'|q'',r'',s'')$ denote the
simultaneous eigenfunction in $x',y',z'$ space of the operators $q$, $r$, $s$ for
the eigenvalues $q''$, $r''$, $s''$. Thus

$$(x',y',z'|q'',r'',s'') = \delta(q' - q'')\,\delta(r' - r'')\,\delta(s' - s'')\,F(x',y',z'). \tag{36·55}$$

In adopting this notation let us agree that the density factor (cf. p. 239)
in the normalization integral shall be unity in $x',y',z'$ space. Conversely
we designate a simultaneous eigenfunction of the operators $x$, $y$, $z$ in
$q'',r'',s''$ space by $(q'',r'',s''|x',y',z')$ with the understanding that the
density function shall be unity in $q'',r'',s''$ space. Similarly we write
$(x',y',z'|)$ for $\psi(x',y',z')$ and $(q'',r'',s''|)$ for the corresponding probability
amplitude in $q'',r'',s''$ space.

As a normalization condition we use the relation

$$\iiint (x',y',z'|q'',r'',s'')\,(x',y',z'|q''',r''',s''')^*\,dx'\,dy'\,dz' = \delta(q'' - q''')\,\delta(r'' - r''')\,\delta(s'' - s''') \tag{36·56}$$

in analogy with Eq. (36·51). Introducing Eq. (36·55) and transforming
the integral over $x',y',z'$ space into an equivalent over $q',r',s'$ space, we have

$$\int_Q dq'\,dr'\,ds'\,D F^2\,\delta(q' - q'')\,\delta(q' - q''')\,\delta(r' - r'')\,\delta(r' - r''')\,\delta(s' - s'')\,\delta(s' - s''') = \delta(q'' - q''')\,\delta(r'' - r''')\,\delta(s'' - s'''), \tag{36·57}$$

where $D$ is the functional determinant or Jacobian $\partial(x',y',z')/\partial(q',r',s')$.
The equation is satisfied if we set $DF^2 = 1$ or identify $F$ with the square
root of the Jacobian of the $q',r',s'$ coordinate system with respect to the
$x',y',z'$ system. Thus

$$F = \left[\frac{\partial(q',r',s')}{\partial(x',y',z')}\right]^{1/2}. \tag{36·58}$$
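The role of the factor $F$ is easiest to see in one dimension, where the Jacobian reduces to $dq/dx$. The sketch below is a hedged illustration, not the three-dimensional argument of the text: the monotone map `f`, the Gaussian `psi`, and the helper names are all assumptions introduced here. Carrying out the $\delta$-function integral for the transformed amplitude gives $\psi/F$ evaluated at the corresponding point, and the code checks that the total probability is then the same in $x$ space and in $q$ space.

```python
import math


def f(x):
    """Hypothetical monotone coordinate change q = f(x)."""
    return x + x ** 3 / 3.0


def df(x):
    """dq/dx = 1 + x^2, positive everywhere, so the map is one-to-one."""
    return 1.0 + x * x


def psi(x):
    """Hypothetical wave function in x space."""
    return math.exp(-x * x / 2.0)


def amplitude_in_q(x):
    """Amplitude in q space at q = f(x): psi divided by F = sqrt(dq/dx)."""
    return psi(x) / math.sqrt(df(x))


def trapezoid(grid, values):
    """Trapezoidal rule on a possibly non-uniform grid."""
    return sum(0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
               for i in range(len(grid) - 1))


xs = [-8.0 + 16.0 * k / 4000 for k in range(4001)]
qs = [f(x) for x in xs]                      # image grid in q space
norm_x = trapezoid(xs, [psi(x) ** 2 for x in xs])
norm_q = trapezoid(qs, [amplitude_in_q(x) ** 2 for x in xs])
print(norm_x, norm_q)   # the two normalization integrals agree closely
```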


An expansion of the arbitrary wave function $(x',y',z'|)$ into simultaneous
eigenfunctions of $q$, $r$, and $s$ would have the form

$$(x',y',z'|) = \int_Q (x',y',z'|q'',r'',s'')\,(q'',r'',s''|)\,dq''\,dr''\,ds''. \tag{36·59}$$

Such an expansion follows directly from the properties of the $\delta$ function
if we introduce the symbol $T$ for the substitution

$$x' \to \varphi(q',r',s'), \qquad y' \to \theta(q',r',s'), \qquad z' \to \chi(q',r',s'),$$

and define $(q'',r'',s''|)$ by the equations

/ // .//IN _ r^(^ yV ^\~V(x" v" z''\'\ 

= fy',r''/'\x',y',z'){x',y',z'\)dx'dy'dz’, (36-60) 

{q",T",s''\x',y',z') = " *">• 

(36-61) 


Equation (36·61) evidently defines a properly normalized simultaneous
eigenfunction of the variables $x$, $y$, $z$ in $q'',r'',s''$ space exactly analogous
with Eqs. (36·55) and (36·58), while Eqs. (36·59) and (36·60) form
parallels to Eq. (36·17) and the Fourier-coefficient formulas (36·13)
and (36·14).





The two real functions $(x',y',z'|q'',r'',s'')$, $(q'',r'',s''|x',y',z')$ are
equivalent and may be set equal to each other, for they yield the same
result when multiplied by an arbitrary function of either set of variables
and integrated over the corresponding coordinate space.

By direct transformation we readily verify the scalar-product relation
[cf. Eq. (36·16)]

$$\iiint (x',y',z'|A)\,(x',y',z'|B)^*\,dx'\,dy'\,dz' = \int_Q (q'',r'',s''|A)\,(q'',r'',s''|B)^*\,dq''\,dr''\,ds'', \tag{36·62}$$

as well as the mean-value theorem

$$\bar{q} = \iiint f(x',y',z')\,|(x',y',z'|)|^2\,dx'\,dy'\,dz' = \int_Q q''\,|(q'',r'',s''|)|^2\,dq''\,dr''\,ds''. \tag{36·63}$$


Thus the functions {x\y\z'\) and (g",/*",s"|) can be identified, respectively, 
with the functions ^|/x and of Eq. (35*9) 

36i. Transformation of Probability Amplitudes from One Arbitrary 
Coordinate Scheme to Another* — It is now evident that the formalism 
developed for type 1 coordinate systems can be adapted to include
ordinary positional coordinate systems. This can be done either with
a system of approximate eigenfunctions, or, less rigorously, with the
aid of the $\delta$ function. The essential feature of this formalism is that
it gives a unitary operator $T_x^{(\alpha)}$ which can be used to transform any
probability amplitude $\psi_x(x')$ in Cartesian-coordinate space into an
equivalent probability amplitude in a space in which the eigenvalues
of the new coordinates are laid out "at right angles" to one another.

In Eqs. (36·13), (36·18), (36·41), and (36·44) the operator $T_x^{(\alpha)}$ and its
inverse are expressed as the limits of sum-integrals when $\eta$ approaches
zero. Equations (36·14), (36·17), (36·50), (36·59), and (36·60) give
these operators the more convenient form of direct sum-integrals but
are of less generality and rigor.


We are thus led to conjecture that every linear operator $\alpha^{(x)}$ which
conforms to the condition laid down on p. 264, and hence forms the
representative in $x'$ space of a real dynamical variable $\alpha$, can be brought within
the scope of the type 1 formalism. This conjecture implies (a) that the
system of eigenvalues of any such operator can be made to form one element
of a complete coordinate system $\alpha_1', \ldots, \alpha_\lambda'$ connected with the
Cartesian-coordinate system by means of a unitary operator $T_x^{(\alpha)}$; (b) that
the operator $T_x^{(\alpha)}$ and its inverse can be expressed as the limits of corresponding
sum-integrals;¹ (c) that by means of the $\delta$ function or similar devices it is
always possible to express the operators as direct sum-integrals.

These assumptions form an essential basis for any attempt to pass 
from the von Neumann formulation of quantum-mechanical theory to 
the Dirac-Jordan transformation theory. So far as the writer is aware 
no one has carried through the details of a complete proof of the validity 
of the assumptions (a) and (b), but it is possible to go a long way toward 
formulating such a proof and there is apparently no very good reason to 
question these items seriously. The third assumption is more doubtful 
and should be clarified by a more complete specification of the allowed 
procedure. Nevertheless we shall adopt it as extremely useful for 
heuristic purposes, while cautioning the reader to check by more rigorous 
methods any suspicious results obtained from postulate (c). 

We proceed to summarize the formulas implied by assumptions (a)
and (b), recasting them at the same time in the interests of symmetry.
Let $\alpha_1, \alpha_2, \ldots, \alpha_\lambda$ denote any set of mutually compatible dynamical
variables with a sequence of sets of definite approximate simultaneous
eigenfunctions, say $\phi(x'|\alpha')_\eta$, for which the properties of orthogonality
and completeness are valid in the limit as the parameter $\eta$ approaches
zero. The probability amplitude in $\alpha'$ space corresponding to the wave
function $(x'|) \equiv (x_1', \ldots, x_{3n}'|)$ is

(a'l) = = lim f <ji{x'\a)f{x'\)dxidx 2 

= lim V 

The reverse transformation is given in all cases of interest, where (x'|) is 
at least piece-by-piece continuous, by 

(x'l) = = lim V (36-66) 

,-.0 

Here ^{x'\a'). is in all cases closely related to (^(x'|a')i> and in Eqs. (36-41) 
and (36-44) — ^which becomes a special case of (36-65) when <r = + so — 
the two functions are sensibly the same. In case the a’s arc of type 1 
[cf. Eqs. (36-13) and (36-18)], the functions <l>(x'\a% and ^(x'|aOi are 
mean values of the exact simultaneous eigenfunction {x'\a') over hyper- 
cubes of side 1? in a' space and x' space, respectively. Since 17 is ultimately 
to be set equal to zero, we can in each case substitute for <l>{x'\a')t/ or 
ip(x'\a')„, as the case may be, a step function {x'\a% defined as follows. 
Let the combined space of the a'’s and the x'’h be divided up into hypen- 

¹ In the future we shall refer to a unitary operator which conforms to the
assumption (b) as a unitary integral operator.


- • - dXsn' 

(36-64) 





cubes of side j;. In each hypercube we give (x'\a% a constant value 
equal to the average value of ix'\a') in that hypercube, viz., 


{x'\a')„ = 

Xx’,« ^ 

Hypercube 


(36*66) 


A similar process is readily carried out for variables of type 2 and we 

conclude that in general the transformations and can be 

given the symmetrical form^ 

^ i: 


(..'D = = lim V (a'\xW\)i 

tj-^O 

'h = = Hr 

17— *0 


(36-64a) 

(x'l) = (r-0^“^(a'|) = lira y ,WUa'\), (36-65a) 

«-*n 

where (a'|:r')i 7 is defined by 




(36-67) 


As the Cartesian coordinates have continuous spectra ranging from
$-\infty$ to $+\infty$, the sum-integral over $x'$ in (36·64a) is just a multiple integral.
If the operators $\alpha$ also have purely continuous spectra, the sum-integral
over $\alpha'$ in (36·65a) also reduces to a simple multiple integral, but if some
or all the $\alpha$'s have discrete spectra it takes the form of a pure summation, or
a mixed summation-integration process as the case may be.

Adopting the assumption (c), i.e., interchanging the order of the
operations $\lim_{\eta\to 0}$ and $⨋_{x'}$, or $⨋_{\alpha'}$, as the case may be, we throw the


(«) _ 

1 In view of the symmetry of the formxilas for and {T it would appear 
unnecessary to preserve the awkward notation {T for the latter, the symbol 


being adequate. Such a contraction of notation is not desirable, however, as 

is not uniquely defined by the two sets of coordinates involved. If we multiply 
the functions (x'\a')r, by an arbitrary phase factor which in the limit 1 ; ~ 0 

becomes a continuous function of the a' eigenvalues where they form a continuous set, 
we obtain a new approximate probability amplitude as good as the original one but 


yielding a different transformation operator, say As indicated in Sec. 366 the 

physical results derived from one choice of phases is the same as from another, but it 

(®) 

is necessary to be consistent and the notation (T~^) is a reminder that the choice 
of {leases in Eqs. (36*64a) and (36*66a) must be the same; otherwise one oj^erator 
would not be the inverse of the other. 



transformation equations into the form

$$(\alpha'|) = T_x^{(\alpha)}(x'|) = ⨋_{x'} (\alpha'|x')\,(x'|), \tag{36·68}$$

$$(x'|) = (T_x^{(\alpha)})^{-1}(\alpha'|) = ⨋_{\alpha'} (x'|\alpha')\,(\alpha'|), \tag{36·69}$$

with the understanding that $(\alpha'|x')$ will have the properties of a $\delta$ function
if one or more of the $\alpha$'s are dynamical variables of type 2. In view of
(36·67), it is necessary that

$$(x'|\alpha') = (\alpha'|x')^*.$$

Let us now consider a third set of independent variables • 

{») (d) 

transformation operator T Let T ' dtaiote tlie operator 

It is linear and has the inverse 

Since 


(36-70) 

, i^at with a unitary 

(3671) 

(3672) 


(36-73) 

(0) («) (0) 

T is unitary. Finally it transforms the operator = [ak X] into ak^' by the 

(0), (0) 

rule of Eq. (36-22). Thus has the same general properties as and we should 
expect to be able to give it a similar form. In fact 

ri(2)/ /IN ^ 


= lim Vo'|:r'),rHin V ^{x'\a')^{a'\)] 

= lim V ,(^'la'),,(a'|), 

• T7-^o 


where 


= 2^,(rl*'),(x'k)„ = (a'k)/. 

Using the postulate (c) we get the simplified equations 


(36-74) 


(36-75) 


(36-76) 


which we shall ordinarily assume to hold for any two complete sets of coordinates. 

Whether we give the transformation the form (36·64a) or (36·68)
the application of the term probability amplitude to $(\alpha'|)$ implies that the
probability of the eigenvalues in the region $G$ of $\alpha'$ space for an assemblage
of systems in a state described by $(x'|)$ is to be defined as

$$Q(G) = ⨋_{\alpha' \text{ in } G} |(\alpha'|)|^2. \tag{36·77}$$



The probability of an individual discrete eigenvalue $\alpha_k'$, or of an individual
elementary range $d\alpha_k'$, is similarly defined as $Q(\alpha_k')$ or $Q(\alpha_k')\,d\alpha_k'$,
respectively, where

$$Q(\alpha_k') = ⨋'_{\alpha'} |(\alpha_1', \ldots, \alpha_k', \ldots|)|^2.$$

Here the symbol $⨋'_{\alpha'}$ means the application of the operation $⨋$
to the entire range of values of each of the $\alpha'$ coordinates except $\alpha_k'$.
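For coordinates with purely discrete spectra the prescription is an ordinary marginal sum. The following sketch assumes a hypothetical system described by two discrete quantum numbers; the amplitude table and the name `marginal` are invented for illustration.

```python
def marginal(amplitudes, k):
    """Probability of each value of the k-th coordinate, summed over the rest."""
    result = {}
    for labels, amplitude in amplitudes.items():
        value = labels[k]
        result[value] = result.get(value, 0.0) + abs(amplitude) ** 2
    return result


# Hypothetical normalized amplitudes labelled by (alpha_1', alpha_2').
amplitudes = {
    (1, 0): 0.6,
    (1, 1): 0.0,
    (2, 0): 0.48j,
    (2, 1): 0.64,
}

print(marginal(amplitudes, 0))   # distribution of alpha_1'
print(marginal(amplitudes, 1))   # distribution of alpha_2'
```

A coordinate with a continuous spectrum would require an integral in place of the sum over that coordinate.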

So far we have made no general statement about the range of applicability
of the unitary operators $T_x^{(\alpha)}$. If the $\alpha$'s are all of type 1,
it follows from the completeness of the system of simultaneous eigenfunctions
[cf. Eq. (36·15)] that if $\psi(x')$ is quadratically integrable, the
transform $T_x^{(\alpha)}\psi$ exists and is quadratically summable. Conversely,
by the previously assumed extension of the Fischer-Riesz theorem
(cf. footnote 1, p. 259, and the remark following Eq. (36·37) on p. 262),
it follows that if the probability amplitude $c(\alpha')$ is quadratically summable
its transform $\psi(x') = (T_x^{(\alpha)})^{-1} c(\alpha')$ exists and is quadratically
integrable. In case the $\alpha$'s are multiplication operators, Eq. (36·60)
shows that $T_x^{(\alpha)}$ reduces to the operation of multiplying by the square
root of the appropriate Jacobian determinant and substituting for each
Cartesian coordinate the equivalent function of the $\alpha$'s. Here again
the operators $T_x^{(\alpha)}$ and $(T_x^{(\alpha)})^{-1}$ are defined and yield quadratically
summable transforms provided that the function operated on is quadratically
summable. Hence we shall assume that this property holds
in the most general case.

Consider next the Hermitian manifold of the real dynamical-variable
operator $\alpha \equiv \alpha^{(x)}$. If $\psi_A$ belongs to this manifold, it is necessary that
$\psi_A$ and $\alpha\psi_A$ shall be quadratically integrable. Then the transforms of
these functions in $\alpha'$ space are $(\alpha'|A)$ and $\alpha'(\alpha'|A)$. If $\psi_A$ and $\psi_B$ satisfy
this necessary condition, it follows from Sec. 36c that

$$(\alpha\psi_A, \psi_B) = ⨋_{\alpha'} \alpha'\,(\alpha'|A)\,(\alpha'|B)^* = (\psi_A, \alpha\psi_B).$$

Thus, the necessary condition is sufficient and it follows that the operator
$\alpha$ is Hermitian with respect to the manifold of all quadratically integrable
functions with quadratically integrable transforms.

We can now prove our right to apply the operator $\alpha$ term by term, or
under the integral sign, to the expansions (36·65a) and (36·69), provided
that the function expanded, i.e. $(x'|)$, has a quadratically integrable
transform by $\alpha$. In that case

$$\alpha(x'|) = (T_x^{(\alpha)})^{-1}\,[\alpha'\times]\,T_x^{(\alpha)}(x'|) = \lim_{\eta\to 0} ⨋_{\alpha'} (x'|\alpha')_\eta\,\alpha'(\alpha'|) = ⨋_{\alpha'} (x'|\alpha')\,\alpha'(\alpha'|), \tag{36·79}$$

which establishes the proposition.

In Sec. 32d we gave a proof of the reality of the discrete eigenvalues of
the standard Hamiltonian operator based on its Hermitian character
and immediately applicable to any other Hermitian operator. We
have not yet given a general proof of the reality of the continuous-spectrum
eigenvalues, though the discussion of the continuous spectrum as
the limit of a discrete spectrum in Secs. 30 and 32j would lead us to infer
that reality. The desired proof is readily established, however, by
means of the transformation to an $\alpha$ space. Thus if $(x'|)$ belongs to
the Hermitian manifold of $\alpha$ and $(\alpha'|)$ is its transform by $T_x^{(\alpha)}$, we have

$$(\alpha(x'|), (x'|)) - ((x'|), \alpha(x'|)) = ⨋_{\alpha'} [\alpha' - (\alpha')^*]\,|(\alpha'|)|^2 = 0.$$

This equation must hold for every $(\alpha'|)$ which together with $\alpha'(\alpha'|)$ is
quadratically summable. Hence $\alpha' - (\alpha')^*$ must vanish for all values
of $\alpha'$, i.e., $\alpha'$ is real.

36j. Dynamical Variables with Complex Eigenvalues.^ — The defini- 
tions given on p. 251 of the Hermitian manifold of an operator a 
and of its eigenvalues are converted into definitions of the unitary
manifold of the unitary operator $U \equiv U^{(x)}$ and of its eigenvalues by
substituting $U$ for $\alpha$ and unitary for Hermitian throughout. The
discrete eigenfunctions of $U$ are defined as solutions of $U\psi = u\psi$ which
belong to the unitary manifold of $U$. The orthogonality of the eigenfunctions
of such an operator can be proved in the same way as for an
Hermitian operator. In fact the whole theory of the transformations
$T_x^{(\alpha)}$ and $(T_x^{(\alpha)})^{-1}$ goes through equally well if we allow the operators $\alpha$,
which make up the coordinate system, to be either unitary or Hermitian.
The eigenvalues of a unitary operator, however, are in general complex
and of absolute value unity. To prove this we note that if $U^{-1}$ is the
inverse of $U$ and $\phi$ is an eigenfunction of $U$ with the eigenvalue $u$,

$$U^{-1}U\phi = u\,U^{-1}\phi = \phi.$$

¹ The complex eigenvalues of non-Hermitian operators here introduced must not
be confused with the complex eigenvalues of Hermitian operators used in the theory
of radioactive disintegration (Sec. 31f).



It follows that $\phi$ is also an eigenfunction of $U^{-1}$ with the eigenvalue $u^{-1}$.
Then

$$(U\phi, \phi) - (\phi, U^{-1}\phi) = [u - (u^{-1})^*]\,(\phi, \phi) = 0.$$

Therefore $(u^{-1})^* = u$, or $|u| = 1$. A more general proof applicable to
the continuous spectrum can be worked out with the aid of a transformation
to a space in which the eigenvalues of $U$ are coordinates.
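A two-dimensional example makes the result tangible. The sketch below uses a plane rotation as a hypothetical unitary operator (an assumption chosen for illustration, not an example from the text); its eigenvalue for the vector $(1, -i)$ is $e^{i\theta}$, which is complex and of absolute value unity.

```python
import cmath
import math


def apply(matrix, vector):
    """Matrix-vector product for small dense matrices."""
    return [sum(matrix[i][j] * vector[j] for j in range(len(vector)))
            for i in range(len(matrix))]


theta = 0.7
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]     # real orthogonal, hence unitary

phi = [1.0, -1.0j]                 # candidate eigenvector
u = cmath.exp(1j * theta)          # candidate eigenvalue

# Largest component of U*phi - u*phi; zero (to rounding) for an eigenvector.
residual = max(abs(a - u * b) for a, b in zip(apply(U, phi), phi))
print(residual, abs(u))
```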

It now becomes evident that the real dynamical-variable operators
are a part of a wider class of linear operators which can also be thrown
into multiplicative form by a suitable canonical transformation but whose
eigenvalues are in general complex. Let $T_x^{(q)}$ denote any unitary integral
operator defined for every quadratically integrable function $(x'|)$, and
let $f(q')$ denote any real or complex function of the coordinates in the $q$
space defined by $T_x^{(q)}$. Let $\gamma$ be the operator in Cartesian-coordinate
space defined by

$$\gamma \equiv \gamma^{(x)} = (T_x^{(q)})^{-1}\,[f(q')\times]\,T_x^{(q)}. \tag{36·80}$$

$\gamma(x'|)$ is defined and quadratically integrable if

$$⨋_{q'} |f(q')\,T_x^{(q)}(x'|)|^2 \tag{36·81}$$

converges. Let $(x'|A)$ and $(x'|B)$ denote two quadratically integrable
functions with quadratically integrable transforms $(q'|A)$, $(q'|B)$, respectively.
Let $\gamma^\dagger$ be defined by

$$\gamma^\dagger = (T_x^{(q)})^{-1}\,[f(q')^*\times]\,T_x^{(q)}. \tag{36·82}$$

Then

$$(\gamma(x'|A), (x'|B)) = ⨋_{q'} f(q')\,(q'|A)\,(q'|B)^* = ((x'|A), \gamma^\dagger(x'|B)). \tag{36·83}$$

Thus $\gamma^\dagger$ is adjoint to $\gamma$ with respect to Cartesian-coordinate space and
the linear manifold of all functions which make the expression (36·81)
convergent. We call this the adjoint manifold of $\gamma$. Finally, we generalize
the term dynamical variable to include the class of operators
composed of $\gamma$ and its canonical transforms, provided that the adjoint
manifold contains the manifold of physically admissible wave functions D.
The eigenvalues of the general dynamical variable $\gamma$ are the possible
values of $f(q')$ and the eigenfunctions are defined by the transformation
operator $T_x^{(q)}$ when that is thrown into the form (36·64) or (36·68).
The eigenfunctions of $\gamma^\dagger$ are identical with those of $\gamma$, but the eigenvalues
of $\gamma$ and $\gamma^\dagger$ for any given eigenfunction are complex conjugates. The
real dynamical variables form that subclass of the general dynamical



variables which are Hermitian and have real eigenvalues. By substituting
a real function for $f(q')$ we can always derive a real dynamical
variable with the same system of eigenfunctions from any given complex
variable $\gamma$. Hence the eigenfunctions of the general dynamical variables
must always have the same orthogonality properties as the eigenfunctions
of real dynamical variables.

So far as concerns the theory of measurements it would suffice to
restrict the discussion to variables with real eigenvalues, but in the
mathematical discussion of the symmetry properties of the Schrödinger equation
the concept of the broader class of variables is useful.



CHAPTER VIII 


COMMUTATION RULES AND RELATED MATTERS 

37. SIMULTANEOUS EIGENFUNCTIONS AND THE COMMUTATION OF
DYNAMICAL VARIABLES

37a. Operator Algebra. — The sum and product of two linear operators
are defined by the equations

$$(\alpha + \beta)\psi = \alpha\psi + \beta\psi, \tag{37·1}$$

$$(\alpha\beta)\psi = \alpha(\beta\psi). \tag{37·2}$$

The multiplication of an operator $\alpha$ by a number $c$ follows the rule

$$(c\alpha)\psi = c(\alpha\psi) = \alpha(c\psi). \tag{37·3}$$

These operations conform to the rules of ordinary algebra except for the
commutative law of multiplication and so define an algebra for linear
operators similar to the algebra of matrices.
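The comparison with matrix algebra can be made explicit with $2\times 2$ matrices standing in for the operators. The matrices and helper names below are hypothetical choices for illustration; the sketch shows that the distributive and scalar rules hold while the commutative law of multiplication fails.

```python
def add(a, b):
    """Operator sum: (a + b) acts as a*psi + b*psi."""
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]


def mul(a, b):
    """Operator product: (a b) acts as a applied to (b*psi)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Multiplication of an operator by a number."""
    return [[c * entry for entry in row] for row in a]


alpha = [[0, 1], [1, 0]]
beta = [[1, 0], [0, -1]]
gamma = [[2, 3], [5, 7]]

print(mul(alpha, beta))   # [[0, -1], [1, 0]]
print(mul(beta, alpha))   # [[0, 1], [-1, 0]]; the order of the factors matters
```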

If the functions $\psi$ and $\chi$ belong to the adjoint manifold of $\alpha$,

$$(c\alpha\psi, \chi) = c(\psi, \alpha^\dagger\chi) = (\psi, c^*\alpha^\dagger\chi).$$

Hence $c\alpha$ has the adjoint $c^*\alpha^\dagger$, the adjoint manifold being identical with
that of $\alpha$. If $\alpha$ is a dynamical variable, $c\alpha$ is also a dynamical variable.
If $\alpha$ is a real dynamical variable, i.e., is Hermitian, $c\alpha$ is a real dynamical
variable if, and only if, $c$ is real.

Let us now assume that $\alpha$ and $\beta$ are dynamical variables and that
$\psi$ and $\chi$ belong to class D. It follows that

$$((\alpha + \beta)\psi, \chi) = (\alpha\psi, \chi) + (\beta\psi, \chi) = (\psi, \alpha^\dagger\chi) + (\psi, \beta^\dagger\chi) = (\psi, (\alpha^\dagger + \beta^\dagger)\chi).$$

Hence $\alpha + \beta$ has an adjoint manifold which includes class D. To the
additional question whether $\alpha + \beta$ gives rise to a solution of the von
Neumann eigenvalue problem, or has a complete orthonormal set of
eigenfunctions, because $\alpha$ and $\beta$ separately have that property, we can
offer no positive answer. However, von Neumann has shown that
Hermitian operators which do not yield such solutions are to be regarded
as exceptional, whereas unitary operators always yield corresponding
resolutions of unity.¹ Hence it is to be expected that in most cases the
appropriate resolution of unity will exist. If this hypothesis is correct,
$\alpha + \beta$ is a dynamical variable. Furthermore $\alpha + \beta$ is Hermitian and a
real dynamical variable if $\alpha$ and $\beta$ are Hermitian.

¹ von Neumann, M.G.Q., II, 9.



The case of the product of two dynamical variables involves further
uncertainty. We cannot be sure that $\alpha\beta\psi$ will be quadratically integrable
if $\psi$ and $\beta\psi$ are quadratically integrable. However, the severe restrictions
on class D require that $\beta\psi$ and $\alpha\beta\psi$ shall be quadratically integrable
provided that $\psi$ is of class D and provided that $\alpha$ and $\beta$ are positive powers
of the Cartesian coordinates or of their conjugate momenta. Let us
assume that $\alpha$ and $\beta$ are dynamical variables of this basic classical type.
Then if $\psi$ and $\chi$ belong to class D

$$(\alpha\beta\psi, \chi) = (\beta\psi, \alpha\chi) = (\psi, \beta\alpha\chi).$$

Thus $\alpha\beta$ has an adjoint manifold which includes class D. If we make the
optimistic assumption that $\alpha\beta$ yields a solution of the von Neumann
eigenvalue problem, we may infer that $\alpha\beta$ is a dynamical variable. It
is Hermitian if, and only if, the operator $\alpha\beta$ and its adjoint $\beta^\dagger\alpha^\dagger$ are
equivalent. If $\alpha$ and $\beta$ are Hermitian, it suffices that they commute.
If they are Hermitian but do not commute, we can always form a Hermitian
operator from the products $\alpha\beta$ and $\beta\alpha$ by taking their sum, for

$$((\alpha\beta + \beta\alpha)\psi, \chi) = (\psi, (\alpha\beta + \beta\alpha)\chi).$$

In the limiting case of a sharply defined wave packet the average value of
$\tfrac{1}{2}(\alpha\beta + \beta\alpha)$ can be identified with the classical value of the product of
the variables $\alpha$ and $\beta$ (cf. Sec. 13), so that this symmetrized Hermitian
operator forms the quantum-mechanical analogue of the classical product
variable.

There is evidently a large class of classical dynamical variables 
which can be correlated with quantum-mechanical dynamical-variable 
operators. We have no reason to suppose, however, that every function 
of the coordinates and momenta defining a classical dynamical variable 
can be correlated with a quantum-mechanical dynamical variable, 
nor is it altogether certain that when such a correlation exists it is unique. 

Theorem: Any algebraic relationship between two or more operators
based on Eqs. (37·1), (37·2), and (37·3) is preserved when the operators
are subjected to a canonical transformation of the type (36·5). The proof
is left to the reader.

37b. Functions of a Single Operator. — Equations (37·1), (37·2), and
(37·3) give meaning to the various powers of an operator $\alpha$ and hence to
any function of $\alpha$ expressible as a polynomial or power series. Consider
the simple case

$$f(\alpha) = c_1\alpha + c_2\alpha^2$$

where $c_1$ and $c_2$ are any complex numbers. If $\psi_n$ denotes a discrete
eigenfunction of $\alpha$ with the eigenvalue $a_n$, we have

$$f(\alpha)\psi_n = c_1\alpha\psi_n + c_2\alpha(\alpha\psi_n) = (c_1 a_n + c_2 a_n^2)\psi_n = f(a_n)\psi_n. \tag{37·4}$$
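Equation (37·4) is easy to verify for a small matrix. In the sketch below the symmetric matrix, the coefficients, and the helper names are hypothetical choices made for illustration; the matrix has eigenvalues 3 and 1 with eigenvectors $(1, 1)$ and $(1, -1)$.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(len(vector)))
            for i in range(len(matrix))]


def f_of_operator(alpha, c1, c2):
    """Build c1*alpha + c2*alpha^2 from operator sums and products."""
    square = mul(alpha, alpha)
    n = len(alpha)
    return [[c1 * alpha[i][j] + c2 * square[i][j] for j in range(n)]
            for i in range(n)]


def f_of_number(a, c1, c2):
    """The same polynomial applied to an ordinary number."""
    return c1 * a + c2 * a ** 2


alpha = [[2, 1], [1, 2]]
c1, c2 = 2, 3j                       # complex coefficients are allowed
F = f_of_operator(alpha, c1, c2)

print(apply(F, [1, 1]), f_of_number(3, c1, c2))    # eigenvalue f(3)
print(apply(F, [1, -1]), f_of_number(1, c1, c2))   # eigenvalue f(1)
```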



Thus $\psi_n$ is an eigenfunction of $f(\alpha)$ as well as of $\alpha$, provided that it
conforms to the appropriate boundary conditions. In the most general
case this means that $\psi_n$ must belong to the adjoint manifold of $f(\alpha)$.
If $f(a_n)$ is real for every value of $a_n$, it means that $\psi_n$ belongs to the
Hermitian manifold of $f(\alpha)$.

Let us consider the most general case in which $\alpha$ is required to be a
dynamical variable with an adjoint $\alpha^\dagger$. Let $T_x^{(\alpha)}$ denote once more a
unitary integral operator defining a canonical transformation which
converts $\alpha$ to multiplicative form. Then

$$\alpha \equiv \alpha^{(x)} = (T_x^{(\alpha)})^{-1}\,[\alpha'\times]\,T_x^{(\alpha)}.$$

Equation (37·4) is seen to be equivalent to

$$f(\alpha) = (T_x^{(\alpha)})^{-1}\,[f(\alpha')\times]\,T_x^{(\alpha)}, \tag{37·5}$$

or

$$f(\alpha)(x'|) = \lim_{\eta\to 0} ⨋_{\alpha'} (x'|\alpha')_\eta\,f(\alpha')\,(\alpha'|).$$

Let $f(\alpha)^\dagger$ be defined by

$$f(\alpha)^\dagger = (T_x^{(\alpha)})^{-1}\,[f(\alpha')^*\times]\,T_x^{(\alpha)}.$$

It follows [cf. Eq. (36·83)] that $f(\alpha)^\dagger$ is adjoint to $f(\alpha)$ with an adjoint
manifold which includes all eigenfunctions and eigendifferentials of $\alpha$
together with all other functions $\psi(x)$ such that $⨋_{\alpha'} |f(\alpha')(\alpha'|)|^2$ converges.
If $|f(\alpha')|$ has an upper bound, it is evident that the adjoint
manifold of $f(\alpha)$ will contain all class D functions, so that $f(\alpha)$ generates
a dynamical variable. The condition is sufficient but not necessary.
In any case the eigenfunctions and eigendifferentials of $f(\alpha)$ are the same
as those of $\alpha$, while the eigenvalues are related by $f(\alpha)' = f(\alpha')$.

Equation (37·5) gives a scheme for defining functions of an operator
with a complete orthonormal system of eigenfunctions which is alternative
to the method of building up the function by the processes of
multiplication and addition described in Sec. 37a. The same result is
obtained if we adopt the method of von Neumann [cf. Eq. (36·35)] and
write

$$f(\alpha)(x'|) = \int_{-\infty}^{+\infty} f(\sigma)\,d[E_\alpha(\sigma)(x'|)]. \tag{37·6}$$

If every eigenfunction of an operator $\alpha$ is also an eigenfunction of the
operator $\beta$, it follows that all eigenfunctions of $\alpha$ with a common eigenvalue
$\alpha'$ are also eigenfunctions of $\beta$ with a common eigenvalue $\beta'$.¹

¹ Otherwise we could choose a linear combination of eigenfunctions of $\alpha$ with a
common eigenvalue which would not be an eigenfunction of $\beta$.





In that case each eigenvalue of is a function of the corresponding eigen- 
value of a and we say that is a function of a. If two or more different 
eigenvalues of a correspond to the same eigenvalue of the relation is 
not a reciprocal one and a is not a function of /?. If a is not a function of 
jS and jS is not a function of a, the operators a, arc said to be independent. 

As has been emphasized by Dirac, functions of an operator defined by
means of (37-4) and (37-5) can be just as discontinuous as functions of a
real or complex number. Furthermore, this equation can be used to
define the square root of an operator, or, in many cases, the reciprocal
of one.¹

If the operator $\alpha$ is Hermitian, it is easy to choose $f(\alpha')$
so that the function $f(\alpha)$ defined by (37-5) is unitary. For this
purpose it is sufficient that $f(\alpha')$ shall map the axis of reals
on the unit circle in the complex plane, so that $|f(\alpha')| = 1$ if
$\alpha'$ is real (cf. p. 275).

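A particularly important case is obtained by setting
$f(x) = e^{i\lambda x}$, where $\lambda$ is a real parameter. We denote
the corresponding operator function of $\lambda$ and $\alpha$ by
$e^{i\lambda\alpha}$. Then

$\frac{\partial}{\partial\lambda}\left[e^{i\lambda\alpha}(x'|)\right] = i\alpha\,e^{i\lambda\alpha}(x'|).$

Thus the function $\psi = e^{i\lambda\alpha}(x'|)$ is automatically a
solution of the differential equation

$\alpha\psi = \frac{1}{i}\frac{\partial\psi}{\partial\lambda}$

which reduces to the form $(x'|)$ when $\lambda$ is zero. If we set
$\lambda = -2\pi t/h$ and identify $\alpha$ with the Hamiltonian
operator $H$ we secure in this way a solution of the second Schrödinger
equation.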

If the function $e^{i\lambda\alpha}(x'|)$ is analytic in $\lambda$, it
is evident that by a repetition of the above differentiation process we
can derive

$e^{i\lambda\alpha}(x'|) = \sum_{n=0}^{\infty}\frac{(i\lambda\alpha)^n}{n!}(x'|).$

Thus in this case the operator $e^{i\lambda\alpha}$ is equivalent to its
formal power-series development (cf. Sec. 32b, p. 201).
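The equivalence between $e^{i\lambda\alpha}$ defined through the eigenvalues and its formal power series can be checked numerically for a matrix. The sketch below is an illustration under assumptions: $\alpha$ is taken to be the 2x2 matrix with eigenvalues $\pm 1$, and the function names `exp_series` and `exp_spectral` are hypothetical.

```python
import cmath

def matmul(x, y):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(alpha, lam, terms=40):
    """Partial sum of sum_n (i*lam*alpha)^n / n! over n < terms."""
    n = len(alpha)
    term = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]  # n = 0 term
    total = [row[:] for row in term]
    for order in range(1, terms):
        # term_n = term_{n-1} * (i*lam*alpha) / n
        term = [[1j * lam / order * v for v in row] for row in matmul(term, alpha)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def exp_spectral(lam):
    """exp(i*lam*alpha) for alpha = [[0, 1], [1, 0]], built from the
    eigenvalues +1 and -1 and their projectors."""
    plus, minus = cmath.exp(1j * lam), cmath.exp(-1j * lam)
    return [[(plus + minus) / 2, (plus - minus) / 2],
            [(plus - minus) / 2, (plus + minus) / 2]]

alpha = [[0, 1], [1, 0]]
lam = 0.7
series = exp_series(alpha, lam)
closed = exp_spectral(lam)
max_error = max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2))
```

For a bounded matrix the series always converges, so `max_error` is at rounding level; for unbounded operators the analyticity condition stated in the text is a genuine restriction.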

¹ In the case of an operator $\alpha$ which has the discrete eigenvalue
zero, the equation $\alpha^{-1}\psi_n = \alpha_n^{-1}\psi_n$ has no
meaning when $\alpha_n$ is zero. However, if 0 is in the continuous
spectrum of $\alpha$, $\alpha^{-1}$ can still exist.

37c. Commutative Operators. In general the product of two operators
depends upon the order of their application, i.e., the commutative law
of ordinary multiplication does not apply. Consider, for example, the
basic operators

$q_k = [q_k'\times], \qquad p_l = \frac{h}{2\pi i}\frac{\partial}{\partial q_l'}.$

In this case

$(p_l q_k - q_k p_l)\psi = \frac{h}{2\pi i}\,\delta_{kl}\,\psi$    (37-7)

or

$p_l q_k - q_k p_l = \frac{h}{2\pi i}\,\delta_{kl}.$    (37-8)

This "commutation" rule is characteristic of the behavior of so-called
conjugate dynamical variables in quantum mechanics and will be the
subject of further discussion in Sec. 39.
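The commutation rule for a coordinate and its conjugate momentum can be verified on polynomial test functions with exact arithmetic. This is a minimal sketch, assuming one coordinate $x$ and representing a polynomial by its list of coefficients; the helper names `q`, `d`, and `sub` are hypothetical. Since $p = (h/2\pi i)\,d/dx$, showing that $(d/dx\;x - x\;d/dx)\psi = \psi$ is the same as showing $(pq - qp)\psi = (h/2\pi i)\psi$.

```python
from fractions import Fraction

def q(poly):
    """Multiply by x: coefficient of x^k moves to x^(k+1)."""
    return [Fraction(0)] + list(poly)

def d(poly):
    """Differentiate with respect to x."""
    return [k * c for k, c in enumerate(poly)][1:] or [Fraction(0)]

def sub(a, b):
    """Coefficient-wise difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [Fraction(0)] * (n - len(a))
    b = list(b) + [Fraction(0)] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

# Assumed test function: psi(x) = 3 - x + 5 x^3
psi = [Fraction(3), Fraction(-1), Fraction(0), Fraction(5)]

# (d/dx x - x d/dx) psi, i.e. the commutator without the factor h / (2 pi i)
commutator = sub(d(q(psi)), q(d(psi)))
```

Here `commutator` equals `psi` coefficient by coefficient, so the bracket acts as multiplication by a constant, as (37-8) states for $k = l$.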

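If $\alpha$ and $\beta$ are any two operators, the operator
$(\alpha\beta - \beta\alpha)$ is called the Poisson bracket¹ of $\alpha$
and $\beta$ and is indicated by the symbol $[\alpha,\beta]$.

$[\alpha,\beta] \equiv \alpha\beta - \beta\alpha.$    (37-9)

Thus Eq. (37-8) is usually written in the form

$[q_k, p_l] = -\frac{h}{2\pi i}\,\delta_{kl}.$    (37-10)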

Definition: Two operators $\alpha$ and $\beta$ are said to commute if
their Poisson bracket yields zero when applied to any quadratically
integrable functions for which it is defined. When $\alpha$ and $\beta$
do commute in this sense we can set $[\alpha,\beta]\psi$ equal to zero
for every $\psi$ whose transform by $[\alpha,\beta]$ is not defined a
priori, and so insure that the transform of every quadratically
integrable function by $[\alpha,\beta]$ shall exist and vanish
identically. We shall assume hereafter that $\alpha$ and $\beta$ always
commute when $[\alpha,\beta]\psi = 0$ for every $\psi$ which is a member
of some complete set of functions. Exceptions to this rule can
apparently occur but are deemed sufficiently unlikely to be disregarded
for the purposes of quantum mechanics.

It follows from the theorem of p. 279 that if the representatives of two
dynamical variables commute in any $q'$ space, they commute in every
legitimate coordinate space. In that case we say that the dynamical
variables themselves commute.

¹ The notation is taken over from classical dynamical theory (cf. Dirac,
P.Q.M., 2d ed., section 25).

Any two multiplication operators in a given coordinate space, such as
$[q_k'\times]$ and $[q_l'\times]$, must commute by the ordinary laws of
algebra. Furthermore, if the spectra of $q_k$ and $q_l$ are both purely
continuous, as in the case of ordinary positional coordinates, the
conjugate operators

$p_k^{(q)} = \frac{h}{2\pi i}\frac{\partial}{\partial q_k'}$  and  $p_l^{(q)} = \frac{h}{2\pi i}\frac{\partial}{\partial q_l'}$

commute by the elementary rules of partial differentiation. Finally
(37-8) shows that $p_k^{(q)}$ and $q_l^{(q)}$ commute if $q_k$ and $q_l$
are independent variables with continuous spectra in a legitimate
coordinate space.

If $\alpha$ is any dynamical variable and $f(\alpha)$ is any function of
$\alpha$, it follows that $\alpha$ commutes with $f(\alpha)$. This is a
consequence of the fact that in a system of coordinates $\alpha_1'$,
$\alpha_2'$, ..., of which $\alpha'$ is a member, the operators $\alpha$
and $f(\alpha)$ are multiplication operators.

A more interesting case of commutation is that of two (or more)
independent dynamical variables $\alpha_1$, $\alpha_2$ whose
representatives in Cartesian space $\alpha_1^{(x)}$ and $\alpha_2^{(x)}$
have a complete orthonormal system of simultaneous eigenfunctions. In
the two-particle problem, for example, the three dynamical variables
$H$, $\mathcal{L}^2$, $\mathcal{L}_z$ have simultaneous eigenfunctions.
Although no one of these operators is a function of either of the
others, or conjugate to either of the others, they commute in pairs. To
prove this we transform to a space in which the eigenvalues of the three
operators are the coordinates. In this space all three are
multiplication operators and must therefore commute. It follows that
they commute also in the original Cartesian-coordinate space when
applied to any quadratically integrable function. At the end of Sec. 38
it will be verified that the three operators $H$, $\mathcal{L}^2$,
$\mathcal{L}_z$ commute in pairs with respect to all functions to which
the product operators are applicable, whether the transforms are
quadratically integrable or not.

Generalizing the above result, we see that if $\alpha_1$ and $\alpha_2$
are any two members of a mutually compatible set of dynamical variables
whose eigenvalues form a legitimate coordinate system connected with the
basic Cartesian system by means of a unitary transformation operator
$T$, it is necessary that $\alpha_1$ and $\alpha_2$ shall commute. In
this case also $\alpha_1^{(\alpha)}$ and $\alpha_2^{(\alpha)}$ are
multiplication operators which commute by the laws of ordinary algebra.

The question now arises whether the commutability of two independent
dynamical variables $\alpha_1$ and $\alpha_2$ is also a sufficient
condition for the existence of a coordinate system of which they are
members. This question, or its equivalent in one form or another, is
usually answered in the affirmative, but the proposition has not been
rigorously established. The ingenious argument due to Dirac¹ and
intended to cover the general case is unsatisfactory from the standpoint
of rigor. An alternative discussion by von Neumann² is rigorous, but not
so general in scope as might be desired. The following brief discussion
is rigorous as far as it goes, but incomplete.

¹ P. A. M. Dirac, P.Q.M., 1st ed., section 17.

Let $\alpha$ and $\beta$ denote the representatives in
Cartesian-coordinate space of two commuting dynamical variables. Let
$D_\alpha$ and $D_\beta$ denote the class D adjoint manifolds of
$\alpha$ and $\beta$, respectively. Let $\varphi_{nm}$ be a normalized
discrete eigenfunction of $\alpha$ with the eigenvalue $\alpha_n$. We
assume that $\alpha_n$ has a finite number, say $g_n$, of mutually
orthogonal eigenfunctions so that the most general eigenfunction with
the eigenvalue $\alpha_n$ is of the form
$\sum_{m=1}^{g_n}\varphi_{nm}a_{nm}$. It follows from the commutation
rule that

$\alpha(\beta\varphi_{nm}) = \beta(\alpha\varphi_{nm}) = \alpha_n(\beta\varphi_{nm}).$

Let us now introduce the additional assumption that the transform of
every $\varphi_{nm}$ by $\beta$ belongs both to $D_\alpha$ and to
$D_\beta$. Then $\beta\varphi_{nm}$ vanishes identically, or is an
eigenfunction of $\alpha$. In the former case $\varphi_{nm}$ is a
simultaneous eigenfunction of $\beta$ and $\alpha$. In the latter case,
if the eigenvalue $\alpha_n$ is nondegenerate ($g_n = 1$),
$\beta\varphi_{nm}$ must be a multiple of $\varphi_{nm}$. Since it
belongs to $D_\beta$, $\varphi_{nm}$ is an eigenfunction of $\beta$ as
well as of $\alpha$. If this is true of every eigenfunction of $\alpha$,
$\beta$ is a function of $\alpha$ and the original $\varphi$'s form a
complete set of simultaneous eigenfunctions.

² von Neumann, M.G.Q., II, 10, pp. 88-93.

In the more general case where some of the $\alpha_n$'s are degenerate,
$\beta\varphi_{nm}$ is necessarily a linear combination of the functions
$\varphi_{n1}, \ldots, \varphi_{ng_n}$:

$\beta\varphi_{nm} = \sum_{k=1}^{g_n}\varphi_{nk}\,\beta(nk;nm).$    (37-11)

Here $\beta(nk;nm)$ is an expansion coefficient. The same type of
expansion holds for every value of $m$ from 1 to $g_n$. The coefficients
$\beta(nk;nm)$ form a square matrix $\|\beta(nk;nm)\|$ with $g_n$ rows
and $g_n$ columns. If the functions $\varphi_{nm}$ of the original set
happen to be all eigenfunctions of $\beta$ as well as of $\alpha$, the
matrix will have the simple "diagonal" form

$\|\beta(nk;nm)\| = \|\beta_{nk}\,\delta_{km}\|$

in which all elements vanish except those on the principal diagonal
(cf. Sec. 43, p. 349). If the matrix is not diagonal, we can replace the
$\varphi$'s by a new set of normalized orthogonal eigenfunctions of
$\alpha$, say $\psi_{n1}, \ldots, \psi_{ng_n}$, derived from the old by
means of a linear transformation

$\psi_{nm} = \sum_{k=1}^{g_n}\varphi_{nk}\,c_{km}.$

It is then always possible to choose this transformation so that the new
functions form an orthonormal set of simultaneous eigenfunctions of
both operators. The proof of this proposition is algebraic and will be
supplied in connection with our general examination of the matrix form
of the eigenvalue problem in Sec. 44c. If $\alpha$ has a discrete
spectrum only, so that the $\varphi$'s form a complete set, the
$\psi$'s will form a similar complete set.
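The replacement of a degenerate set of eigenfunctions by simultaneous eigenfunctions can be illustrated with 3x3 matrices. The matrices below are an assumed toy example (not taken from the text): `alpha` has the doubly degenerate eigenvalue 1, and `beta` commutes with it but is not diagonal in the original basis of that degenerate subspace. The helper `eigenvalue_or_none` is hypothetical.

```python
import math

def apply(matrix, vec):
    """Matrix acting on a column vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eigenvalue_or_none(matrix, vec, tol=1e-12):
    """Return the eigenvalue if vec is an eigenvector of matrix, else None."""
    image = apply(matrix, vec)
    value = sum(v * w for v, w in zip(vec, image)) / sum(v * v for v in vec)
    if all(abs(w - value * v) < tol for w, v in zip(image, vec)):
        return value
    return None

alpha = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]   # eigenvalue 1 is doubly degenerate
beta = [[0, 1, 0], [1, 0, 0], [0, 0, 5]]    # commutes with alpha

s = 1 / math.sqrt(2)
phi = [[1, 0, 0], [0, 1, 0]]     # original eigenfunctions of alpha (eigenvalue 1)
psi = [[s, s, 0], [s, -s, 0]]    # new orthonormal combinations of the phi's

beta_on_phi = [eigenvalue_or_none(beta, v) for v in phi]    # not eigenvectors of beta
beta_on_psi = [eigenvalue_or_none(beta, v) for v in psi]    # eigenvalues +1 and -1
alpha_on_psi = [eigenvalue_or_none(alpha, v) for v in psi]  # still eigenvalue 1
```

The new vectors remain eigenvectors of `alpha` because any linear combination within a degenerate subspace does; the linear transformation only has to diagonalize the small matrix of `beta` inside that subspace.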

Fortunately the hypotheses on which the above proof of the existence
of a complete system of simultaneous eigenfunctions of $\alpha$ and
$\beta$ is based can be somewhat relaxed without loss in rigor. The
technique involved in extending the theorem to the weaker hypotheses
(due to von Neumann) is too elaborate for reproduction here and we
content ourselves with a statement of the result. Let $\alpha$ and
$\beta$ denote two Hermitian¹ operators for which the
eigenvalue-eigenfunction problem is solvable. Let $M_a$ denote the
linear manifold of eigenfunctions of $\alpha$ for the arbitrary
eigenvalue $a$. It suffices for the existence of a complete set of
simultaneous eigenfunctions of $\alpha$ and $\beta$ that (a) they
commute; (b) one of the operators, say $\alpha$, has a purely discrete
spectrum; (c) independent of the value of $a$ it is possible to
approximate every function $\psi$ in $M_a$ with any desired precision
by an element of $M_a$ which belongs to the Hermitian manifold of
$\beta$. Von Neumann does not give this theorem explicitly, but it is
implicit in the discussion of commuting operators which he does give
(M.G.Q., II, 10). A generalization of the theorem to cases in which
neither operator has a purely discrete spectrum is much to be desired
but presents serious mathematical difficulties.

It follows as a corollary of the above theorem that if $\alpha$ does not
have a purely discrete spectrum but does have the discrete eigenvalue
$a$, every function in $M_a$ can be expanded into a series (or integral)
of simultaneous eigenfunctions of $\alpha$ and $\beta$. To verify the
corollary one needs only to observe that it is always possible to define
an operator $\gamma$ with two eigenvalues 0 and 1, such that every
function in $M_a$ belongs to the eigenvalue 1 and every quadratically
integrable function orthogonal to the elements of $M_a$ belongs to the
eigenvalue 0. This operator is then a function of $\alpha$ which
satisfies the above postulates (a), (b), (c). Since $\gamma$ and $\beta$
have a complete set of simultaneous eigenfunctions, every function in
$M_a$ can be expanded in terms of them. This proves the proposition.

¹ Since every dynamical-variable operator gives rise to a Hermitian
operator with the same set of eigenfunctions, and vice versa, the
theorem must hold if $\alpha$ and $\beta$ are dynamical variables which
are not necessarily Hermitian, provided that we substitute the phrase
"adjoint manifold of $\beta$" for "Hermitian manifold of $\beta$."

Let us now identify $\alpha$ with the Hamiltonian operator $H$, and
$\beta$ with an arbitrary dynamical variable which commutes with $H$.
The discrete eigenfunctions of $H$ belong to class D and the adjoint
manifold of $\beta$ includes class D. Hence (c) is satisfied for every
discrete energy level $E_k = a$ and it follows from the above corollary
that every eigenfunction of $H$ which belongs to $E_k$ can be expanded
into a series (or integral) of simultaneous eigenfunctions of $H$ and
$\beta$.

Experience goes beyond our proofs to show that ordinarily any two
dynamical variables which commute satisfy the condition (c) and have a
complete system of simultaneous eigenfunctions. Hence we shall
designate any set of dynamical variables which have such a system of
eigenfunctions as a normal commuting set. The variables which form
such a set are mutually compatible in the sense of Sec. 36d. There
exists then a legitimate coordinate system of which any two normal
commuting dynamical variables $\alpha_1$, $\alpha_2$ are members, with a
unitary operator $T$ which transforms $\alpha_1^{(x)}$ and
$\alpha_2^{(x)}$ simultaneously to multiplicative form.

37d. Functions of a Normal Set of Commuting Dynamical Variables.

A function of such a set of commuting dynamical variables
$\alpha_1, \alpha_2, \ldots, \alpha_\lambda$ is defined as a dynamical
variable whose representative in $\alpha'$ space has the form

$f(\alpha_1, \alpha_2, \ldots, \alpha_\lambda)^{(\alpha)} = [f(\alpha_1', \alpha_2', \ldots, \alpha_\lambda')\times].$

In other words, it is an operator $f$ such that every simultaneous
eigenfunction of the members of the commuting set is also an
eigenfunction of $f$. It follows from this definition that if we can
prepare an assemblage of physical systems with a unique set of values of
$\alpha_1, \alpha_2, \ldots, \alpha_\lambda$ it must necessarily have a
unique value of $f$. If no one member of a set of normal commuting
dynamical variables is a function of the others, the variables are said
to be mutually independent.

Following the argument of Sec. 36d, p. 250, it is easy to show
that if some of the sets of simultaneous discrete eigenvalues of a normal
group of commuting dynamical variables exhibit a finite degeneracy,
it is always possible to remove that degeneracy by the addition of one
or more independent commuting dynamical variables to the group.¹
Extending an hypothesis introduced in Sec. 36d, we postulate that all
the degeneracy of the simultaneous eigenvalues of such a group, for both
discrete and continuous spectra, can always be removed in this way.
On the other hand, if none of the simultaneous eigenvalues of such a
set of normally commuting operators, say
$\alpha_1, \alpha_2, \ldots, \alpha_\lambda$, is degenerate, any
operator $\beta$ which commutes normally with all members of the group
must be a function of the $\alpha$'s. This follows from the fact that
there exists a complete system of simultaneous eigenfunctions of
$\alpha_1, \ldots, \alpha_\lambda, \beta$, each of which is determined
to a constant factor by the eigenvalues of the $\alpha$'s alone.
Consequently every simultaneous eigenfunction of the $\alpha$'s must be
a priori an eigenfunction of $\beta$, and every set of eigenvalues of
the $\alpha$'s must determine a unique corresponding eigenvalue of
$\beta$. Hence a set of normally commuting dynamical variables is said
to be complete if every one of its sets of simultaneous eigenvalues is
nondegenerate. Clearly we can with advantage restrict ourselves to the
use of coordinate systems based on complete sets of normally commuting
independent dynamical variables. The number of operators in such a
complete set is not a characteristic of the dynamical system under
investigation, like the number of degrees of freedom in classical
mechanics, but depends on the choice of operators. In fact we can in
principle replace any complete set of normally commuting independent
dynamical variables by a single dynamical variable without changing the
system of eigenfunctions. To this end it is only necessary to pass a
single line through all allowed points of $\alpha'$ space and to
correlate points on this line with distance from its starting point.
This is possible in principle by a well-known mathematical theorem.

¹ P. A. M. Dirac, P.Q.M., 1st ed., pp. 46-46.

Theorem:¹ If the dynamical variables $\alpha$ and $\beta$ form a normal
commuting pair, $\alpha$ commutes with any function of $\beta$ and
$\beta$ with any function of $\alpha$. The proof is an immediate
consequence of the fact that $\alpha$, $\beta$, and all functions of
$\alpha$, or of $\beta$, are represented by multiplication operators in
any coordinate system of which $\alpha'$ and $\beta'$ are elements.

Theorem: If every dynamical variable which commutes with the dynamical
variable $\alpha$ commutes also with a second dynamical variable
$\beta$, it follows that $\beta$ is a function of $\alpha$.

¹ Cf. Dirac, op. cit., p. 41.

Let $\gamma$ be an arbitrary dynamical variable which commutes with
$\alpha$ and hence with $\beta$. We shall consider the representatives
of $\alpha$, $\beta$, and $\gamma$ in a space
$\alpha_1', \alpha_2', \ldots$, in which one coordinate, say
$\alpha_1'$, gives the eigenvalues of $\alpha$. The theorem then reduces
to the statement that, under the conditions stated, $\beta^{(\alpha)}$
has the form $[w(\alpha_1')\times]$. Let $\gamma^{(\alpha)}$ be any
multiplication operator, say $[\varphi(\alpha_1', \alpha_2', \ldots)\times]$,
and let $u(\alpha')$ denote an arbitrary probability amplitude. As
$\gamma^{(\alpha)}$ is a multiplication operator, it commutes with
$\alpha^{(\alpha)}$ and hence with $\beta^{(\alpha)}$ as well. Thus
$\beta^{(\alpha)}\varphi u = \varphi\,\beta^{(\alpha)}u$. It follows
that for arbitrary choices of the functions $u$ and $\varphi$ and at
points where neither vanishes

$\frac{\beta^{(\alpha)}\varphi u}{\varphi u} = \frac{\beta^{(\alpha)}u}{u} = w(\alpha_1', \alpha_2', \ldots).$

This is sufficient to establish the multiplicative character of the
operator $\beta^{(\alpha)}$.

It remains to show that the function $w(\alpha')$ is actually
independent of all the $\alpha'$'s except $\alpha_1'$. This can be done
if we can find a set of nonmultiplicative operators in $\alpha'$ space
which commute with the operation of multiplying by $\alpha_1'$ but not
with any of the other operators $[\alpha_2'\times]$,
$[\alpha_3'\times]$, .... If $\alpha_k$, for example, has a purely
continuous spectrum ranging from $-\infty$ to $+\infty$, the operator
$i\,\partial/\partial\alpha_k'$ represents a dynamical variable in
$\alpha'$ space, commutes with $[\alpha_1'\times]$ but not with
$[\alpha_k'\times]$. By hypothesis this operator must commute with
$\beta^{(\alpha)}$, which means that

$\frac{\partial}{\partial\alpha_k'}(wu) = w\,\frac{\partial u}{\partial\alpha_k'}$

for every differentiable $u$. Thus $\partial w/\partial\alpha_k'$ is
identically equal to zero, and $w$ is independent of $\alpha_k'$. The
more general case where the spectrum of $\alpha_k$ is partially
continuous and partially discrete can be dealt with by means of suitably
defined exchange operators. It will suffice to consider two points on
the discrete spectrum of $\alpha_k$ which we label $\alpha_k'$ and
$\alpha_k''$. To prove that $w$ has the same value at any pair of
corresponding points
$P_1 = (\alpha_1', \alpha_2', \ldots, \alpha_k', \ldots)$,
$P_2 = (\alpha_1', \alpha_2', \ldots, \alpha_k'', \ldots)$
we identify $\gamma^{(\alpha)}$ with the operator that interchanges the
values of $u(\alpha')$ at every such pair of points, while leaving
$u(\alpha')$ unchanged everywhere else. The operator is readily seen to
fulfill the requirements for a dynamical variable and commutes with
$[\alpha_1'\times]$. It can commute with $[w\times]$, however, only if
$w$ has the same value at every pair of corresponding points. As the
values of $\alpha_k'$ and $\alpha_k''$ are arbitrary, it follows that
$w$ is independent of $\alpha_k'$ over the entire discrete spectrum.
Similarly we show that it is independent of $\alpha_k'$ over the entire
continuous spectrum. To demonstrate that $w$ has the same value for a
pair of corresponding points $P_1$, $P_2$ in this spectrum, a more
general type of Hermitian operator $\gamma^{(\alpha)}$ is needed; we
omit the discussion of this final step in the proof.

Obviously we can replace the single dynamical variable $\alpha$ by a
normal set of commuting dynamical variables in each of the foregoing
theorems.


38. THE CONSERVATION LAWS 

38a. Conservation of Energy. In the classical mechanics the energy
and the Hamiltonian function $H(p,q)$ of a conservative system are
constant in time for any natural motion. In quantum mechanics we
have, in general, no single definite energy but a distribution function
giving the probability of various energy levels. By "conservation of
energy" we may therefore imply the constancy of this distribution
function in time, or, what comes to the same thing, the constancy of the
mean values of $H$ and its various powers. If the Hamiltonian operator
$H$ does not depend upon the time explicitly, as hitherto assumed,
conservation is readily proved with the aid of the expansion (35-1)
applied to a class D solution of the second Schrödinger equation (7-6).
It follows directly from the method of calculating energy probabilities
described in Sec. 35a [cf. especially Eq. (35-2)] that the distribution
function which gives the probability of different energy values is
independent of $t$.
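The constancy of the energy distribution can be seen numerically in a toy model. The sketch below assumes a state expanded in three discrete energy eigenfunctions whose coefficients carry the time factors $e^{-2\pi i E_n t/h}$; the energies, the initial coefficients, the unit choice $h/2\pi = 1$, and the function names are all illustrative assumptions.

```python
import cmath

H_BAR = 1.0  # assumed units in which h / (2*pi) = 1

def evolve(c0, energies, t):
    """Coefficients c_n(t) = c_n(0) * exp(-i E_n t / hbar) for a time-independent H."""
    return [c * cmath.exp(-1j * e * t / H_BAR) for c, e in zip(c0, energies)]

def probabilities(c):
    """Probability |c_n|^2 of each energy level."""
    return [abs(x) ** 2 for x in c]

def mean_power(c, energies, power=1):
    """Mean value of H**power computed from the energy distribution."""
    return sum(p * e ** power for p, e in zip(probabilities(c), energies))

energies = [1.0, 2.5, 4.0]            # assumed discrete energy levels
c0 = [0.6, 0.64j, 0.48]               # assumed normalized initial coefficients

p_start = probabilities(evolve(c0, energies, 0.0))
p_later = probabilities(evolve(c0, energies, 3.7))
mean_start = mean_power(evolve(c0, energies, 0.0), energies, power=2)
mean_later = mean_power(evolve(c0, energies, 3.7), energies, power=2)
```

Each coefficient changes only by a phase factor of unit modulus, so the probabilities and the mean values of all powers of $H$ agree at the two times to rounding error.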

38b. Variation of Energy When the Hamiltonian Depends on the
Time. In the case of a system for which the classical Hamiltonian
function and the corresponding operator depend explicitly on the time,
the energy is not conserved classically and the eigenfunctions of the
energy operator do not yield monochromatic solutions of the second
Schrödinger equation. It is still possible, however, to define an energy
distribution by expanding $\psi$ into a linear combination of
eigenfunctions of the equation

$H(t)\,\psi = E\,\psi$    (38-1)

in which $t$ is treated as a parameter. This amounts to assigning to $E$
at each instant the energy spectrum and distribution function which
would be appropriate if the operator $H$ could be frozen so that its
instantaneous form became permanent. The coefficients in this expansion
will no longer be simple exponential functions of the time, however, and
hence the distribution function defined by (36-68) and (36-77) will
vary with the time. The average energy will be given by

$\bar{E} = \int \psi^{*} H \psi\, d\tau$    (38-2)

and is obviously dependent on $t$. Taking into account the Hermitian
character of $H$ we readily derive

$\frac{d\bar{E}}{dt} = \int \psi^{*}\,\frac{\partial H}{\partial t}\,\psi\, d\tau.$


In the case of an atom or molecule, subject to the influence of an
external classical radiation field (cf. Sec. 7), we can write the
Hamiltonian as the sum of two terms, one of which represents the energy
in the absence of the field while the other varies with $t$ and gives
the contribution of the field to the motion. The second, or mutual,
energy term is ordinarily assumed to be small. To measure the nonmutual
energy at any instant $t$ it would be necessary to isolate the system by
removing the radiation field and then to determine the energy by a
method appropriate to the isolated or unperturbed system. This
experimental procedure corresponds in theory to the use of an energy
spectrum and distribution function obtained by analyzing the
instantaneous $\psi$ function into a linear combination of
eigenfunctions of the equation

$H_0\,\psi = E\,\psi$    (38-3)

where $H_0$ is the constant part of the Hamiltonian operator.¹ The
variation in the distribution function thus defined (cf. Sec. 54) is
commonly interpreted as due to "quantum jumps" from one energy level to
the other caused by the radiation.

38c. Conservation of an Arbitrary Dynamical Variable. Consider
next the conservation problem for an arbitrary dynamical variable
$\alpha$. We assume that the energy itself is conserved and that
$\alpha$ does not depend explicitly upon the time so that it commutes
with the operator $\partial/\partial t$. Under these conditions a
sufficient condition for the conservation of $\alpha$ is that $H$ and
$\alpha$ form a normal commuting pair of dynamical variables.

To prove this theorem let $T$ denote the unitary operator that
transforms probability amplitudes in Cartesian coordinate space into
probability amplitudes based on a coordinate system
$\beta_1, \beta_2, \ldots$ of which $\alpha$ is a member. The
distribution function giving the probability of different eigenvalues of
$\alpha$ for the Cartesian wave function $\psi$ is

$Q(\alpha') = \sum\!\!\!\!\int{}' \,|T\psi|^2,$

where $\sum\!\!\!\int{}'$ denotes a sum integration over all the
$\beta'$ coordinates except $\alpha'$. As $H$ and $\alpha$ form a normal
commuting pair, we can include $H$ among the dynamical variables
$\beta_1, \beta_2, \ldots$. Then, since $\psi$ is a solution of the
second Schrödinger equation, it follows from Sec. 37b that

$T\psi = (T\psi)_{t=0}\; e^{-2\pi i E t/h}.$

Consequently

$\frac{dQ(\alpha')}{dt} = \sum\!\!\!\!\int{}' \,\frac{\partial}{\partial t}\left|(T\psi)_{t=0}\, e^{-2\pi i E t/h}\right|^2 = 0,$

as was to be proved.

An interesting alternative proof is obtained if we note that in case
the distribution function for $\alpha$ is not constant in the
neighborhood of $\alpha' = \alpha''$ it must be possible to choose a
function $f(\alpha')$ which vanishes outside the immediate neighborhood
of $\alpha' = \alpha''$ and such that
$\frac{d}{dt}\int\psi^{*} f(\alpha)\psi\, d\tau$ does not vanish. Since
$\alpha$ and $H$ are assumed to form a normal commuting pair,
$f(\alpha)\psi$ will belong to the Hermitian domain of $H$ and we have

$\frac{d}{dt}\int\psi^{*} f\psi\, d\tau = \frac{2\pi i}{h}\int\psi^{*}(Hf - fH)\psi\, d\tau = 0.$    (38-6)

Thus we get a contradiction which proves the theorem.

¹ Cf. Kemble and Hill, Rev. Mod. Phys. 2, 6 (1930).
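The step behind the time derivative used in the alternative proof can be written out as follows. This is a sketch under stated assumptions: the second Schrödinger equation is taken in the form $H\psi = -(h/2\pi i)\,\partial\psi/\partial t$, the operator $f = f(\alpha)$ has no explicit time dependence, and both $\psi$ and $f\psi$ lie in the Hermitian domain of $H$.

```latex
\begin{aligned}
\frac{d}{dt}\int \psi^{*} f \psi \, d\tau
  &= \int \frac{\partial \psi^{*}}{\partial t}\, f\psi \, d\tau
   + \int \psi^{*} f\, \frac{\partial \psi}{\partial t}\, d\tau \\
  &= \frac{2\pi i}{h}\int (H\psi)^{*} f\psi \, d\tau
   - \frac{2\pi i}{h}\int \psi^{*} f H\psi \, d\tau \\
  &= \frac{2\pi i}{h}\int \psi^{*}\,(Hf - fH)\,\psi \, d\tau .
\end{aligned}
```

The last line uses the Hermitian property of $H$ to move it from $\psi^{*}$ onto $f\psi$. The integrand vanishes whenever $H$ commutes with $f(\alpha)$, which is the case when $H$ and $\alpha$ form a normal commuting pair.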





In classical mechanics dynamical variables which remain constant
during the natural motion of the system are called integrals of the
equations of motion or, simply, integrals of the motion. The analogue of
such a classical "integral" in quantum mechanics is the dynamical
variable which unites with the Hamiltonian $H$ to form a normal
commuting pair and whose distribution function therefore remains
constant in time when computed from a solution of the Schrödinger
equation

$H\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}.$

We shall therefore refer to such dynamical variables as integrals of the
motion or integrals of the Schrödinger equation. Since an arbitrary
solution of the Schrödinger equation for the hydrogenic-atom problem can
be developed in terms of simultaneous eigenfunctions of $H$,
$\mathcal{L}^2$, and $\mathcal{L}_x$, $\mathcal{L}_y$, or
$\mathcal{L}_z$, each of these operators is an integral of the motion
for such a system. In Sec. 38d it will be proved that these operators
are integrals of the motion for any free atomic or molecular system.

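38d. Commutation Properties of the Hamiltonian and the Angular
Momentum. Let us now briefly consider the commutation properties
of the five operators just mentioned in the general case of an
$n$-particle problem, starting from the definitions

$\mathbf{r}_k = \mathbf{i}x_k + \mathbf{j}y_k + \mathbf{k}z_k,$    (38-7)

$\mathbf{p}_k = \mathbf{i}\xi_k + \mathbf{j}\eta_k + \mathbf{k}\zeta_k = \frac{h}{2\pi i}\,\mathrm{grad}_k,$    (38-8)

$H = \sum_k \frac{1}{2\mu_k}(\xi_k^2 + \eta_k^2 + \zeta_k^2) + V(x_1, \ldots, z_n),$    (38-9)

$\mathcal{L}_x = \sum_k (y_k\zeta_k - z_k\eta_k) = \frac{h}{2\pi i}\sum_k\left(y_k\frac{\partial}{\partial z_k} - z_k\frac{\partial}{\partial y_k}\right),$    (38-10)

$\mathcal{L}_y = \sum_k (z_k\xi_k - x_k\zeta_k) = \frac{h}{2\pi i}\sum_k\left(z_k\frac{\partial}{\partial x_k} - x_k\frac{\partial}{\partial z_k}\right),$    (38-11)

$\mathcal{L}_z = \sum_k (x_k\eta_k - y_k\xi_k) = \frac{h}{2\pi i}\sum_k\left(x_k\frac{\partial}{\partial y_k} - y_k\frac{\partial}{\partial x_k}\right),$    (38-12)

$\mathcal{L}^2 = \mathcal{L}_x^2 + \mathcal{L}_y^2 + \mathcal{L}_z^2.$    (38-13)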




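We note first of all that each of the elementary operators
$x_1, \ldots, z_n$, $\xi_1, \ldots, \zeta_n$ commutes with every member
of the set except its own conjugate. Then, by (38-10),

$(\xi_k^2 + \eta_k^2 + \zeta_k^2)\mathcal{L}_x - \mathcal{L}_x(\xi_k^2 + \eta_k^2 + \zeta_k^2)$
$\quad = (\eta_k^2 + \zeta_k^2)(y_k\zeta_k - z_k\eta_k) - (y_k\zeta_k - z_k\eta_k)(\eta_k^2 + \zeta_k^2)$
$\quad = \zeta_k(\eta_k^2 y_k - y_k\eta_k^2) - \eta_k(\zeta_k^2 z_k - z_k\zeta_k^2)$
$\quad = \frac{h}{\pi i}(\zeta_k\eta_k - \eta_k\zeta_k) = 0.$

Hence

$[\mathcal{L}_x, H] = [\mathcal{L}_x, V] = \frac{h}{2\pi i}\sum_k\left\{[\mathbf{r}_k\times\mathrm{grad}_k]_x V - V[\mathbf{r}_k\times\mathrm{grad}_k]_x\right\}$
$\quad = \frac{h}{2\pi i}\sum_k[\mathbf{r}_k\times(\mathrm{grad}_k V)]_x.$    (38-14)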

The right-hand member of the above equation is an operator which stands
for multiplication by a function of the coordinates, viz., by
$-h/2\pi i$ times the $x$ component of the classical torque applied to
the system in the configuration $x_1, \ldots, z_n$. The Poisson brackets
of $H$ with $\mathcal{L}_y$ and $\mathcal{L}_z$ are obtained by
permuting the letters cyclically. If the external forces applied to the
system have spherical symmetry, the classical applied torque will always
be zero and $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ commute
with $H$. This statement applies in particular to a group of electrons
moving about a fixed, positively charged nucleus. If the force field has
axial symmetry, as in the case of an atom subject to a uniform electric
or magnetic field, the component of angular momentum parallel to the
axis of symmetry commutes with $H$, but the other components do not. If
any of these operators commutes with a Hamiltonian $H$ the sufficient
conditions for a normal commuting pair specified on p. 285 are clearly
fulfilled if we identify $\alpha$ with the operator in question and
$\beta$ with $H$. Hence each of these operators is an integral of the
motion when it commutes with $H$.

Clearly, if $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ all
commute with $H$, $\mathcal{L}^2$ must also commute with $H$. Hence
$\mathcal{L}^2$ is an integral for systems of spherical symmetry, and
for such cases only.

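Finally we note that

$(y_k\zeta_k - z_k\eta_k)(z_k\xi_k - x_k\zeta_k) - (z_k\xi_k - x_k\zeta_k)(y_k\zeta_k - z_k\eta_k)$
$\quad = y_k\xi_k(\zeta_k z_k - z_k\zeta_k) - x_k\eta_k(\zeta_k z_k - z_k\zeta_k)$
$\quad = \frac{h}{2\pi i}(y_k\xi_k - x_k\eta_k).$

Hence

$[\mathcal{L}_x, \mathcal{L}_y] = -\frac{h}{2\pi i}\,\mathcal{L}_z.$    (38-15)

Advancing the letters cyclically we obtain

$[\mathcal{L}_y, \mathcal{L}_z] = -\frac{h}{2\pi i}\,\mathcal{L}_x,$    (38-16)

$[\mathcal{L}_z, \mathcal{L}_x] = -\frac{h}{2\pi i}\,\mathcal{L}_y.$    (38-17)

These commutation relations can be thrown into the vector form

$\vec{\mathcal{L}}\times\vec{\mathcal{L}} = -\frac{h}{2\pi i}\,\vec{\mathcal{L}}$    (38-18)

if due attention is paid to the order of the factors in the various
terms of each component of the vector product
$\vec{\mathcal{L}}\times\vec{\mathcal{L}}$. The operator
$\vec{\mathcal{L}}$ can be written in the form
$\frac{h}{2\pi i}\sum_k \mathbf{r}_k\times\mathrm{grad}_k$ and
transforms like a three-dimensional vector. Hence both sides of (38-18)
transform like vectors and make this equation invariant under a rotation
of the axes.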
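The angular-momentum commutation relations can be checked on a concrete finite matrix representation. The sketch below uses the standard 3x3 matrices for angular-momentum quantum number 1 in units where $h/2\pi = 1$, so that $-h/2\pi i$ becomes the imaginary unit; choosing this representation rather than the differential operators of the text is an assumption made for illustration, and the helper names are hypothetical.

```python
import math

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(x, y):
    """[x, y] = xy - yx, the bracket as used in the text."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol
               for i in range(len(x)) for j in range(len(x)))

s = 1 / math.sqrt(2)
# Angular-momentum matrices for quantum number 1, with h / (2*pi) = 1 (assumed units)
Lx = [[0, s, 0], [s, 0, s], [0, s, 0]]
Ly = [[0, -1j * s, 0], [1j * s, 0, -1j * s], [0, 1j * s, 0]]
Lz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

def times_i(m):
    return [[1j * v for v in row] for row in m]

# Total squared angular momentum Lx^2 + Ly^2 + Lz^2
squares = [matmul(m, m) for m in (Lx, Ly, Lz)]
L2 = [[sum(sq[i][j] for sq in squares) for j in range(3)] for i in range(3)]

xy_ok = close(commutator(Lx, Ly), times_i(Lz))   # [Lx, Ly] = i Lz
yz_ok = close(commutator(Ly, Lz), times_i(Lx))   # [Ly, Lz] = i Lx
zx_ok = close(commutator(Lz, Lx), times_i(Ly))   # [Lz, Lx] = i Ly
zero = [[0, 0, 0] for _ in range(3)]
l2_commutes = all(close(commutator(L2, m), zero) for m in (Lx, Ly, Lz))
```

In this representation the squared angular momentum is twice the unit matrix, so it commutes with every component even though the components do not commute with each other.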

39. CONJUGATE DYNAMICAL VARIABLES AND QUANTUM-MECHANICAL 
EQUATIONS OF MOTION 

39a. Conjugate Dynamical Variables. In classical theory the component
of momentum canonically conjugate to the generalized positional
coordinate $q_k$ is the quantity

$p_k = \frac{\partial L}{\partial\dot{q}_k},$    (39-1)

where $L$ is the Lagrangian function of the coordinates and their
velocities (cf. footnote 1, p. 23). If the forces are derivable from a
potential function the components of linear momentum canonically
conjugate to the Cartesian coordinates $x_k$, $y_k$, $z_k$ are

$\xi_k = \mu_k\dot{x}_k, \quad \eta_k = \mu_k\dot{y}_k, \quad \zeta_k = \mu_k\dot{z}_k,$

respectively. In the more general case of an electrified particle moving
in an external electromagnetic field (cf. Sec. 7d) the components of
linear momentum defined by (39-1) have the form indicated by the vector
equation (16-24).

The introduction of the momenta $p_k$ along with the coordinates $q_k$
permits the reduction of the classical equations of motion to the
first-order canonical Hamiltonian form

$\frac{dq_k}{dt} = \frac{\partial H(p,q,t)}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial H(p,q,t)}{\partial q_k}.$    (39-2)

This form is valid, however, not only for an arbitrary set of positional
coordinates $q_k$ with their corresponding momenta but also if we
express the Hamiltonian function in terms of any set of variables

$Q_k = Q_k(q_k, p_k, t), \qquad P_k = P_k(q_k, p_k, t)$

of a wider class derived from any initial set of $p$'s and $q$'s by
means of a contact transformation which involves both the coordinates
and the momenta. Hence the term "canonically conjugate" is applied in
classical mechanics to any pair of variables $Q_k$, $P_k$ of a double
set which permits the reduction of the equations of motion to the form

$\frac{dQ_k}{dt} = \frac{\partial H}{\partial P_k}, \qquad \frac{dP_k}{dt} = -\frac{\partial H}{\partial Q_k}.$
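The canonical equations lend themselves to direct numerical integration. The sketch below assumes a one-dimensional harmonic oscillator, $H = p^2/2m + kq^2/2$ with $m = k = 1$, and integrates the pair of first-order equations with a leapfrog scheme; the Hamiltonian, the step size, and the function names are illustrative choices, not taken from the text.

```python
import math

MASS = 1.0       # assumed particle mass
STIFFNESS = 1.0  # assumed force constant

def hamiltonian(q, p):
    """H(p, q) = p^2 / 2m + k q^2 / 2 for the assumed oscillator."""
    return p * p / (2 * MASS) + 0.5 * STIFFNESS * q * q

def dH_dq(q, p):
    return STIFFNESS * q

def dH_dp(q, p):
    return p / MASS

def leapfrog(q, p, dt, steps):
    """Integrate dq/dt = dH/dp, dp/dt = -dH/dq with half-step momentum updates."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)
        q += dt * dH_dp(q, p)
        p -= 0.5 * dt * dH_dq(q, p)
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=0.001, steps=1000)   # evolve to t = 1

energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
position_error = abs(q1 - math.cos(1.0))           # exact solution is q = cos(t)
```

The leapfrog update is chosen because it treats $q$ and $p$ symmetrically, in keeping with the canonical form; a simple forward Euler step would integrate the same equations but lets the energy drift steadily.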

In defining conjugate dynamical variables in quantum mechanics
one can seek pairs of operators associated with known pairs of conjugate
classical variables, or one can make direct use of the quantum-mechanical
analogue of the Hamiltonian equations. Starting from the former
standpoint we let $x_k$ and $p_k$ denote, respectively, a classical
Cartesian coordinate and its conjugate momentum. The corresponding
dynamical variables in quantum mechanics are defined by the
representative operators

$x_k^{(x)} = [x_k'\times], \qquad p_k^{(x)} = \frac{h}{2\pi i}\frac{\partial}{\partial x_k'}.$    (39-3)


As another example, consider the $z$ component of angular momentum
$\mathcal{L}_z$, which is classically conjugate to the azimuthal
coordinate $\alpha_1 = \varphi_1$ in a system of coordinates $r_k$,
$\theta_k$, $\alpha_k$ derived from an ordinary spherical system $r_k$,
$\theta_k$, $\varphi_k$ by the transformation

$\alpha_1 = \varphi_1, \qquad \alpha_k = \varphi_k - \varphi_1, \qquad k = 2, 3, \ldots.$

We have already identified the quantum variable $\mathcal{L}_z$ with the
class of operators whose representative in Cartesian coordinates is

$\mathcal{L}_z^{(x)} = \frac{h}{2\pi i}\sum_k\left(x_k\frac{\partial}{\partial y_k} - y_k\frac{\partial}{\partial x_k}\right).$

Furthermore it was proved in Sec. 34d that a "direct transformation" of
$\mathcal{L}_z^{(x)}$ to the $r_k, \theta_k, \alpha_k$ system of
coordinates yields the operator
$\frac{h}{2\pi i}\frac{\partial}{\partial\alpha_1}$. The direct
transformation, however, did not allow for a renormalization of the
probability amplitudes in the new coordinate system and we accordingly
make use of the basic formula (36-6) to get the correct representative
of $\mathcal{L}_z$ in the $r_k, \theta_k, \alpha_k$ system, or the $q'$
system, as we shall hereafter call it.

$\mathcal{L}_z^{(q)} = T\,\mathcal{L}_z^{(x)}\,T^{-1}.$    (39-4)


Here $T$ denotes the operator which transforms a probability amplitude
$(x'|)$ into the corresponding probability amplitude $(q'|)$. Since this
is a transformation from one system of positional coordinates to another
(cf. p. 239), $T$ reduces to the product of the square root of the
Jacobian

$D(q') \equiv \frac{\partial(x_1', y_1', \ldots, z_n')}{\partial(r_1', \theta_1', \ldots, \alpha_n')}$

into the operator $\Theta$ which denotes the substitution for each
Cartesian coordinate of its value in terms of the $q'$ coordinates.
Similarly $T^{-1}$ can be written in the form
$\Theta^{-1}D(q')^{-1/2}$. Thus

$\mathcal{L}_z^{(q)} = D^{1/2}\,\Theta\,\mathcal{L}_z^{(x)}\,\Theta^{-1}\,D^{-1/2}.$    (39-5)

The operator $\Theta\,\mathcal{L}_z^{(x)}\,\Theta^{-1}$ is the direct
transform of $\mathcal{L}_z$ in the $q'$ coordinate system as derived by
the ordinary methods of partial differentiation. The Jacobian $D(q')$
turns out to be the product $\prod_k r_k'^2\sin\theta_k'$. Thus, by
(34-13),

$\mathcal{L}_z^{(q)} = \left(\prod_k r_k'^2\sin\theta_k'\right)^{1/2}\frac{h}{2\pi i}\frac{\partial}{\partial\alpha_1'}\left(\prod_k r_k'^2\sin\theta_k'\right)^{-1/2} = \frac{h}{2\pi i}\frac{\partial}{\partial\alpha_1'}.$    (39-6)

The parallelism between (39·3) and (39·6) suggests that in all cases
where $q_k$ is a positional coordinate we can correlate quantum-mechanical
conjugates with classical conjugates by defining the momentum
quantum-mechanically conjugate to $q_k$ by the formula

$$p_k^{(q)} = \frac{h}{2\pi i}\frac{\partial}{\partial q_k'}. \tag{39·7}$$

By so doing we maintain the commutation rule (37·8) for all such pairs
of dynamical variables.

As a check on this procedure we must verify, if possible, that the
operator $p_k^{(q)}$ so defined will always satisfy the restrictions imposed on
dynamical variables. First of all comes the question of its Hermitian
character.

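Let the spectrum of $q_k$ extend from $q_k = a$ to $q_k = b$. Clearly

$$\Bigl(\frac{h}{2\pi i}\frac{\partial\psi^{(q)}}{\partial q_k'},\ \chi^{(q)}\Bigr) - \Bigl(\psi^{(q)},\ \frac{h}{2\pi i}\frac{\partial\chi^{(q)}}{\partial q_k'}\Bigr) = \frac{h}{2\pi i}\int\cdots\int \Bigl[\psi^{(q)}\chi^{(q)*}\Bigr]_{q_k'=a}^{q_k'=b}\, dq_1' \cdots dq_{k-1}'\,dq_{k+1}' \cdots dq_f'. \tag{39·8}$$

It follows that the operator $\frac{h}{2\pi i}\frac{\partial}{\partial q_k'}$ is
Hermitian with respect to the class of physically admissible functions
$\psi^{(q)}$, provided that the scalar products on the left are convergent for
every pair of functions belonging to this class and provided that every such
function takes on the same values at the end points a and b of the spectrum of
$q_k$.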
Consider the domains

$$q_k(x_1, \cdots, x_f) = a, \qquad q_k(x_1, \cdots, x_f) = b$$

in Cartesian or x space. Since all x space is mapped on that portion of
q space between $q_k = a$ and $q_k = b$, these boundary domains of q space
must either map coincident hypersurfaces in x space or else they must
in some sense map boundary domains in Cartesian space.

Consider, for example, the spherical coordinates $r, \theta, \varphi$. The
angle $\varphi$ ranges from zero to $2\pi$ and the surfaces $\varphi = 0$ and
$\varphi = 2\pi$ are coincident half planes in x space. The angle $\theta$
ranges from zero to $\pi$ and the domains $\theta = 0$ and $\theta = \pi$ of
$r, \theta, \varphi$ space map the two halves of the z axis in x space. As the
z axis is a line and not a surface, we can regard it as a boundary in x space
without thereby removing any volume. The radius r ranges from zero to
infinity, its boundary values marking spheres of zero and infinite radius,
respectively, in x space.

If the domains $q_k = a$, $q_k = b$ are coincident hypersurfaces in x space,
$\bigl[\psi^{(q)}\chi^{(q)*}\bigr]_a^b$ vanishes automatically because
$\psi^{(x)}\chi^{(x)*}$ is single-valued. If these domains are not coincident
hypersurfaces, $\psi^{(q)}$ will ordinarily vanish on both of them. There are
two cases to be considered. The first is that in which $q_k = a$ corresponds to
infinite values of one or more of the Cartesian coordinates. If $\psi^{(x)}$ is
a class D function vanishing exponentially at infinity, we can assume without
appreciable loss of generality that $\psi^{(q)}$ vanishes at infinity or that
$\lim_{q_k' \to a} \psi^{(q)} = 0$. The second case is that, typified by the
limits $\theta = 0$ and $\theta = \pi$ above, in which the domain $q_k = a$
maps a degenerate domain in x space having dimensions less than $f - 1$, f
being the number of dimensions in x space. Under these circumstances an
infinitesimal volume of x space in the neighborhood of the domain in question
will be mapped on a relatively very large volume of q space, so that the
Jacobian D must vanish at $q_k = a$.

Thus the right-hand member of (39·8) will practically always vanish.
The integrals on the left will usually converge and we conclude that if $q_k$
is a positional coordinate obtained from the Cartesian coordinates by an
ordinary point transformation, $\frac{h}{2\pi i}\frac{\partial}{\partial q_k'}$
is normally Hermitian with respect to physically admissible functions in
q space, i.e., with respect to the linear manifold of functions $\psi^{(q)}$
obtained by applying the operator $T^{(q)}_{(x)}$ to the functions of class D.
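The role of the end-point term in (39·8) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes units with h = 1, a single coordinate q on the interval [a, b], the scalar product $(f, g) = \int f g^* dq$, and the hypothetical helper names `p_op`, `inner`, `hermiticity_defect` and `boundary_term`.

```python
import cmath
import math

H = 1.0  # Planck's constant in arbitrary units (assumption)

def p_op(f, q, eps=1e-5):
    """(h / 2 pi i) d/dq applied to f at q, by central differences."""
    return H / (2j * math.pi) * (f(q + eps) - f(q - eps)) / (2 * eps)

def inner(f, g, a, b, n=2000):
    """Scalar product (f, g) = integral of f * conj(g) over [a, b], midpoint rule."""
    dq = (b - a) / n
    total = 0j
    for k in range(n):
        q = a + (k + 0.5) * dq
        total += f(q) * complex(g(q)).conjugate()
    return total * dq

def hermiticity_defect(psi, chi, a, b):
    """(p psi, chi) - (psi, p chi): the left-hand member of (39.8)."""
    return (inner(lambda q: p_op(psi, q), chi, a, b)
            - inner(psi, lambda q: p_op(chi, q), a, b))

def boundary_term(psi, chi, a, b):
    """(h / 2 pi i) [psi chi*] taken between q = a and q = b."""
    value = lambda q: psi(q) * complex(chi(q)).conjugate()
    return H / (2j * math.pi) * (value(b) - value(a))

# Functions with equal end-point values: the defect vanishes.
periodic = lambda q: cmath.exp(1j * q)
print(abs(hermiticity_defect(periodic, periodic, 0.0, 2 * math.pi)))

# Functions with unequal end-point values: the defect equals the boundary term.
ramp = lambda q: complex(q)
one = lambda q: 1 + 0j
print(hermiticity_defect(ramp, one, 0.0, 2 * math.pi),
      boundary_term(ramp, one, 0.0, 2 * math.pi))
```

In this one-dimensional sketch the two sides agree to within the discretization error, which illustrates why equal end-point values are needed for the Hermitian property.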


It does not follow, however, that $\frac{h}{2\pi i}\frac{\partial}{\partial q_k'}$
is the representative in the q coordinate system of a satisfactory real
dynamical variable $p_k$. For an example of a classical positional coordinate
which has no satisfactory quantum-mechanical conjugate we need go no farther
than the radius r in a three-dimensional spherical coordinate system. The
operator $\frac{h}{2\pi i}\frac{\partial}{\partial r'}$ is easily seen to be
Hermitian with respect to physically admissible functions of
$r', \theta', \varphi'$, as these functions all vanish at the origin and at
infinity. Solutions of the equation

$$\frac{h}{2\pi i}\frac{\partial\psi}{\partial r'} = p_r'\psi,$$

however, do not lend themselves to the formation of the desired complete
system of eigenfunctions. They all have the form
$\psi = u(\theta', \varphi')\,e^{\frac{2\pi i}{h}p_r' r'}$
and none of them is quadratically integrable. The eigenvalue spectrum,
if any, must be continuous. The eigendifferentials

$$\int_{p_r'}^{p_r'+\Delta} u\,e^{\frac{2\pi i}{h}p_r'' r'}\,dp_r'' = \frac{h}{2\pi i\,r'}\,u\,e^{\frac{2\pi i}{h}p_r' r'}\Bigl(e^{\frac{2\pi i}{h}\Delta r'} - 1\Bigr)$$

are quadratically integrable but do not vanish at the origin and do not
belong to a linear manifold with respect to which
$\frac{h}{2\pi i}\frac{\partial}{\partial r'}$ is Hermitian.
Hence the operator $\frac{h}{2\pi i}\frac{\partial}{\partial r'}$ does not
define a type 1 operator. As it is not a multiplication operator we can
reasonably infer that it does not define a true dynamical variable in the
quantum-mechanical sense.
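The behavior of the radial eigendifferentials can be illustrated with a short numerical sketch. It assumes units with h = 1 and a constant angular factor u = 1; the function name `eigendifferential` is hypothetical. The closed form used is the one obtained by integrating $e^{2\pi i p r/h}$ over an interval of width $\Delta$ in p.

```python
import cmath
import math

H = 1.0  # Planck's constant in arbitrary units (assumption)

def eigendifferential(r, p, delta, u=1.0):
    """Integral of u * exp(2 pi i p'' r / h) over p'' from p to p + delta."""
    if r == 0.0:
        return complex(u * delta)
    k = 2j * math.pi * r / H
    return u * cmath.exp(k * p) * (cmath.exp(k * delta) - 1) / k

# Near the origin the eigendifferential tends to u * delta, not to zero.
print(abs(eigendifferential(1e-9, 0.3, 0.5)))

# Far from the origin its modulus is bounded by h / (pi r), so that the
# square of the modulus falls off like 1 / r**2.
for r in (10.0, 100.0, 1000.0):
    print(r, abs(eigendifferential(r, 0.3, 0.5)), H / (math.pi * r))
```

The first printed value is close to the interval width 0.5, which is the failure to vanish at the origin noted in the text; the remaining lines show the decay that makes the function quadratically integrable in r.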

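It is of interest to note, however, that this operator does share several of the
important properties of true dynamical variables; perhaps we might call it a
"quasi-variable." For example, we can speak of the "mean value" of $p_r$ for an
assemblage in a state described by the probability amplitude $\psi^{(q)}$ if we
define the mean by

$$\bar{p}_r = \iiint \psi^{(q)*}\,\frac{h}{2\pi i}\frac{\partial\psi^{(q)}}{\partial r'}\,dr'\,d\theta'\,d\varphi'. \tag{39·9}$$

This mean value is real, and its time derivative is given by (38·6). To be
specific, we note that

$$p_r^{(x)} = \bigl(\Theta^{(q)}_{(x)}\bigr)^{-1} D^{-1/2}\,\frac{h}{2\pi i}\frac{\partial}{\partial r'}\,D^{1/2}\,\Theta^{(q)}_{(x)} = \bigl(\Theta^{(q)}_{(x)}\bigr)^{-1}\Bigl[\frac{1}{r'}\,\frac{h}{2\pi i}\frac{\partial}{\partial r'}\,r'\Bigr]\Theta^{(q)}_{(x)}.$$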


Hence, if we use wave functions normalized in Cartesian coordinates, Eq. (38·6)
becomes in this case

$$\frac{d\bar{p}_r}{dt} = \iiint \psi^{*}\Bigl[\frac{1}{r}\,\frac{h}{2\pi i}\frac{\partial}{\partial r}\,r,\ H\Bigr]\psi\,d\tau. \tag{39·10}$$


Although the definition (39·7) does not always yield a conjugate which
is a true dynamical variable, it is the accepted definition for all cases in
which the coordinate $q_k$ has a purely continuous spectrum. The reader
will note that, since $q_k$ does not uniquely determine the q coordinate
system to which (39·7) directly refers, it is in general possible to use that
equation to define several different quantities $p_k$, each of which is the
momentum conjugate to $q_k$ in an appropriate coordinate system.

Let us now assume that the real dynamical variable $q_1$, which in
conjunction with $q_2, \cdots, q_\lambda$ forms a normal complete set of
commuting variables, has a conjugate momentum $p_1$ defined by (39·7) which is
a true dynamical variable. The eigenfunctions of $p_1$ in q' space must be
of the form

$$(q_1', \cdots, q_\lambda' | p_1', \cdots) = e^{\frac{2\pi i}{h}p_1' q_1'} f(q_2', \cdots, q_\lambda'). \tag{39·11}$$

Simultaneous eigenfunctions of $p_1, q_2, \cdots, q_\lambda$ have the form

$$(q_1', \cdots, q_\lambda' | p_1', q_2'', \cdots, q_\lambda'') = A\,e^{\frac{2\pi i}{h}p_1' q_1'}\,\delta(q_2' - q_2'')\,\delta(q_3' - q_3'') \cdots \delta(q_\lambda' - q_\lambda''), \tag{39·12}$$

A denoting a normalization factor. They define a transformation
operator $T^{(\alpha)}_{(q)}$ which carries probability amplitudes in q' space
over into corresponding probability amplitudes in a space in which the
coordinates are

$$\alpha_1 = p_1, \quad \alpha_2 = q_2, \quad \alpha_3 = q_3, \quad \cdots, \quad \alpha_\lambda = q_\lambda.$$

The operator $T^{(\alpha)}_{(q)}$ has the specific form

$$T^{(\alpha)}_{(q)}(q'|) = \int A^{*}\,e^{-\frac{2\pi i}{h}p_1' q_1'}\,\delta(q_2' - q_2'') \cdots \delta(q_\lambda' - q_\lambda'')\,(q'|)\,dq_1' \cdots dq_\lambda'. \tag{39·13}$$

With the aid of $T^{(\alpha)}_{(q)}$ we can replace (39·7) by

$$p_1^{(q)} = \bigl(T^{(\alpha)}_{(q)}\bigr)^{-1}\,[\alpha_1' \times]\,T^{(\alpha)}_{(q)}. \tag{39·14}$$

Equations (39·11), (39·13), and (39·14) afford a generalization of the
definition of the phrase "momentum quantum-mechanically conjugate
to the coordinate $q_k$" to real dynamical variables $q_k$ which are not
positional coordinates and which do not have purely continuous spectra.
An immediate consequence of this generalized definition is the rule that
if $\beta$ is the momentum quantum-mechanically conjugate to $\alpha$, $-\alpha$
is the momentum quantum-mechanically conjugate to $\beta$. Thus the
angular momentum $\mathcal{L}_z$ is the momentum quantum-mechanically
conjugate to the angle $\varphi$ in the coordinate system of Sec. 34d, and
conversely $-\varphi$ is the "momentum" quantum-mechanically conjugate to the
"coordinate" $\mathcal{L}_z$ in the coordinate system
$\mathcal{L}_z, r_k, \theta_k, \alpha_k$. This rule forms
an exact parallel of the corresponding rule for classical mechanics.

Unfortunately, however, it has not yet proved possible to derive the
commutation rule (37·8) except on the basis of the definition (39·7).
As this commutation rule is the usual basis of the applications of the idea
of canonically conjugate variables in quantum mechanics, we can, for
practical purposes, limit the conception to cases where the coordinate
q has a purely continuous spectrum and its conjugate momentum p
is defined by (39·7).

A still more drastic restriction of the class of canonical variables in quantum
mechanics has been suggested by Dirac,¹ who purports to show that the relation
$[p,q] = 1$ implies that p and q have continuous spectra extending from
$-\infty$ to $+\infty$. The appearance of this statement² in the literature is
rather mystifying in view of its immediate contradiction by the familiar and
elementary example of Eq. (39·6). The angular momentum $\mathcal{L}_z$ has a
purely discrete spectrum and the coordinate $\varphi$ which defines the
absolute azimuth of a system of particles with respect to the z axis has a
continuous spectrum ranging from zero to $2\pi$ (according to the usual
convention regarding the principal values), and yet
$[\mathcal{L}_z, \varphi] = 1$.

An examination of Dirac's reasoning affords an illustration of the dangers of
attempting to deal with quantum mechanics on a too formal and algebraic basis.
As a corollary on (39·7) (dropping the unessential subscript k) we have the
relation

$$p^{(q)} f(q') - f(q')\,p^{(q)} = \frac{h}{2\pi i}\frac{df}{dq'}. \tag{39·15}$$

In particular, it follows that for any number c,

$$p^{(q)} e^{icq'} - e^{icq'} p^{(q)} = \frac{ch}{2\pi}\,e^{icq'}. \tag{39·16}$$

Let us now apply this relation to an eigenfunction of p in q' space, which we
designate as $\psi_{p'}$:

$$p^{(q)} e^{icq'}\psi_{p'} = \Bigl(p' + \frac{ch}{2\pi}\Bigr) e^{icq'}\psi_{p'}. \tag{39·17}$$

From this equation Dirac draws the conclusion that we have only to multiply
$\psi_{p'}$ by $e^{icq'}$, where c is now restricted to real values, in order
to obtain an eigenfunction of p with the eigenvalue $p' + \frac{ch}{2\pi}$. In
other words, every real number is an eigenvalue of p. Complex eigenvalues are
excluded on the ground that, if c contains an imaginary part, $e^{icq}$ is not
a "physically admissible" operator.

The argument breaks down, however, for real as well as imaginary values of c
because it takes no account of the requirement that the eigenfunctions of p
must belong to the Hermitian manifold of p, i.e., to $D_p$. Identifying p with
the angular momentum $\mathcal{L}_z$ and q with the angle $\varphi$, the domain
$D_p$ is defined by the requirement that $\psi(\varphi + 2\pi) = \psi(\varphi)$.
As we proved in Sec. 34 this boundary condition gives eigenfunctions of the
form $\psi_m = e^{im\varphi}$, where m is an integer and
$\mathcal{L}_z' = mh/2\pi$. Evidently $e^{ic\varphi}\psi_m$ does not belong to
$D_p$ unless c is an integer.

¹ P. A. M. Dirac, P.Q.M., 1st ed., p. 54; 2d ed., p. 94.

² Dirac's conclusion is apparently correct if the commutation rule $[p,q] = 1$
is assumed to be a valid matrix relation (cf. p. 367), as well as an operator
relation.
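The failure of the shifted functions to satisfy the boundary condition can be seen in a few lines. This is an illustrative sketch only; the helper names `shifted` and `is_periodic` are hypothetical, and the periodicity test simply samples the condition $\psi(\varphi + 2\pi) = \psi(\varphi)$ at a handful of angles.

```python
import cmath
import math

def shifted(c, m):
    """The function exp(i c phi) * exp(i m phi) built from the eigenfunction psi_m."""
    return lambda phi: cmath.exp(1j * c * phi) * cmath.exp(1j * m * phi)

def is_periodic(f, tol=1e-9, samples=16):
    """Check f(phi + 2 pi) = f(phi) at equally spaced sample angles."""
    two_pi = 2 * math.pi
    return all(abs(f(two_pi * k / samples + two_pi) - f(two_pi * k / samples)) < tol
               for k in range(samples))

print(is_periodic(shifted(1, 2)))     # integral c: still in the domain
print(is_periodic(shifted(0.5, 2)))   # non-integral c: outside the domain
```

Multiplication by $e^{ic\varphi}$ therefore carries $\psi_m$ into another admissible eigenfunction only for integral c, in agreement with the discrete spectrum $mh/2\pi$.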

39b. Functions of Non-commuting Linear Operators. — The operator
algebra of Sec. 37 permits us to define simple functions of non-commuting
sets of linear operators, or dynamical variables, as well as of normal
commuting sets. Thus the equation
$\mathcal{L}^2 = \mathcal{L}_x^2 + \mathcal{L}_y^2 + \mathcal{L}_z^2$ defines
$\mathcal{L}^2$ as a function of the elementary operators
$\mathcal{L}_x, \mathcal{L}_y, \mathcal{L}_z$. Similarly,

$$\mathcal{L}_z = x p_y - y p_x$$

defines $\mathcal{L}_z$ as a function of the operators $x, y, p_x, p_y$. Such a
function $F(\alpha, \beta, \cdots)$ of non-commuting operators has in general
very different properties, however, from the previously defined functions of
sets of normal commuting dynamical variables. As the "argument operators"
$\alpha, \beta, \cdots$ have no simultaneous eigenfunctions (at least not a
complete set) there can be no general simple relation between the
eigenfunctions of F and those of its arguments.

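The concept of differentiation is applicable to functions of operators
and can be included among the analytical processes available for building
up such functions. Let I denote the identical operator which carries
any function over into itself. The partial derivative of
$F(\alpha, \beta, \cdots)$ with respect to $\alpha$ is then defined by the
equation

$$\frac{\partial F}{\partial\alpha} = \lim_{\epsilon \to 0} \frac{1}{\epsilon}\bigl[F(\alpha + \epsilon I, \beta, \cdots) - F(\alpha, \beta, \cdots)\bigr]. \tag{39·18}$$

It is readily proved that on the basis of this definition all the ordinary
rules of differentiation apply to the differentiation of operator functions
except that one must have due regard to the order of the factors in dealing
with products. Thus

$$\frac{\partial(F + G)}{\partial\alpha} = \frac{\partial F}{\partial\alpha} + \frac{\partial G}{\partial\alpha}, \tag{39·19}$$

$$\frac{\partial(FG)}{\partial\alpha} = \frac{\partial F}{\partial\alpha}\,G + F\,\frac{\partial G}{\partial\alpha}, \tag{39·20}$$

$$\frac{\partial\alpha^n}{\partial\alpha} = n\alpha^{n-1}. \tag{39·21}$$

If the reciprocal of an operator $\alpha$ exists, Eq. (39·20) shows that

$$\frac{\partial}{\partial\alpha}\bigl(\alpha^n\alpha^{-n}\bigr) = \frac{\partial\alpha^n}{\partial\alpha}\,\alpha^{-n} + \alpha^n\,\frac{\partial\alpha^{-n}}{\partial\alpha} = 0.$$

It follows at once that (39·21) holds for negative values of n as well as for
positive values.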

The process of differentiation takes on a particularly simple form when
we have to do with a function of one or more pairs of canonically conjugate
operators. It makes no difference whether these operators are
true dynamical variables or quasi-variables like the radial momentum
$p_r$. If $F(p,q)$ denotes a function of the canonically conjugate pairs of
operators $q_1, p_1;\ q_2, p_2;\ \cdots;\ q_f, p_f$ built up from its arguments
by addition, multiplication, and division, the partial derivatives of F with
respect to these arguments are

$$\frac{\partial F}{\partial q_k} = \frac{2\pi i}{h}(p_k F - F p_k) = [F, p_k], \tag{39·22}$$

$$\frac{\partial F}{\partial p_k} = \frac{2\pi i}{h}(F q_k - q_k F) = [q_k, F]. \tag{39·23}$$

To prove this statement we note first that the commutation rule

$$[q_l, p_k] = -[p_k, q_l] = \delta_{kl}$$

implies that (39·22) and (39·23) hold when we identify F with any one
of the 2f basic operators $q_1, \cdots, p_f$. But if the theorem be true for
any two functions F and G it holds also for their sum and product. Thus,

$$\frac{\partial(FG)}{\partial q_k} = \frac{\partial F}{\partial q_k}\,G + F\,\frac{\partial G}{\partial q_k} = \frac{2\pi i}{h}\bigl[(p_k F - F p_k)G + F(p_k G - G p_k)\bigr] = [FG, p_k].$$

The rules hold true even for negative powers of the q's if they are
multiplicative operators representing positional coordinates. Thus by
induction the rules (39·22) and (39·23) hold for any function F which
can be built up legitimately by the operations of addition, multiplication,
and division.
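The differentiation rules can be checked on a concrete case by letting the operators act on polynomials in a single coordinate q. The sketch below is illustrative only: it assumes units with h = 1, represents a polynomial by its list of coefficients (lowest power first), takes q as multiplication by q and p as $(h/2\pi i)\,d/dq$, and uses the arbitrarily chosen sample function $F = q^2 p$. The ordinary rules of differentiation give $\partial F/\partial q = 2qp$ and $\partial F/\partial p = q^2$.

```python
import math

H = 1.0                      # Planck's constant, arbitrary units (assumption)
C = H / (2j * math.pi)       # the factor h / (2 pi i)

def q_op(f):
    """Multiply the polynomial f by q."""
    return [0j] + [complex(c) for c in f]

def p_op(f):
    """Apply p = (h / 2 pi i) d/dq to the polynomial f."""
    return [C * k * f[k] for k in range(1, len(f))] or [0j]

def sub(f, g):
    """Difference f - g of two coefficient lists of possibly unequal length."""
    n = max(len(f), len(g))
    f = list(f) + [0j] * (n - len(f))
    g = list(g) + [0j] * (n - len(g))
    return [a - b for a, b in zip(f, g)]

def scale(s, f):
    return [s * c for c in f]

def F(f):
    """Sample operator function F = q^2 p (an arbitrary illustrative choice)."""
    return q_op(q_op(p_op(f)))

def dF_dq(f):
    """(2 pi i / h)(p F - F p) applied to f."""
    return scale(1 / C, sub(p_op(F(f)), F(p_op(f))))

def dF_dp(f):
    """(2 pi i / h)(F q - q F) applied to f."""
    return scale(1 / C, sub(F(q_op(f)), q_op(F(f))))

def close(f, g, tol=1e-9):
    return all(abs(c) < tol for c in sub(f, g))

f = [1, 2, 0, 3]                                      # 1 + 2q + 3q^3
print(close(dF_dq(f), scale(2, q_op(p_op(f)))))       # compares with 2 q p f
print(close(dF_dp(f), q_op(q_op(f))))                 # compares with q^2 f
```

Both comparisons hold for this sample, with the factors kept in the order in which they arise from the product rule.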

39c. An Operator Form of Hamilton's Equations of Motion. —
Equation (38·6) suggests the possibility of defining an operator
$d\alpha/dt$ when $\alpha$ does not contain the time by

$$\frac{d\alpha}{dt} = [\alpha, H], \tag{39·24}$$

thus insuring that when $\psi$ is a solution of the second Schrödinger equation

$$\frac{d\bar{\alpha}}{dt} = \overline{\frac{d\alpha}{dt}}. \tag{39·25}$$

Let us assume that the operator $\alpha$ is Hermitian with respect to the linear
manifold D of physically admissible wave functions and that the transform
of any function in D by $\alpha$ belongs to the Hermitian manifold of H.
Then, from the relation

$$([\alpha, H]\psi_1, \psi_2) = \frac{2\pi i}{h}\bigl\{(\alpha\psi_1, H\psi_2) - (H\psi_1, \alpha\psi_2)\bigr\}$$

it follows that $[\alpha, H]$ is Hermitian with respect to D.



If H is now assumed to be given as a function of a set of generalized
positional coordinates and their conjugate momenta, and if we make use
of the partial derivative formulas (39·22) and (39·23), we obtain, as a
formal parallel to the classical equations of Hamilton,

$$\frac{dp_k}{dt} = -\frac{\partial H(p,q)}{\partial q_k}, \qquad \frac{dq_k}{dt} = \frac{\partial H(p,q)}{\partial p_k}. \tag{39·26}$$

These formulas are largely definition, but, when used in connection with
Eq. (39·25), form the basis for a quantum-mechanical derivation of the
basic Hamiltonian equations of classical mechanics.

To this end we apply the equations of motion to a sharply defined
wave packet, i.e., to a wave function representing a pure case assemblage¹
of heavy systems so prepared that the uncertainty of each of the coordinates
and momenta is very small compared with the absolute value of
the quantity in question. If $\alpha$ denotes a coordinate with a continuous
range of eigenvalues

$$\alpha\psi = \alpha\int c(\alpha')\,\psi_{\alpha'}\,d\alpha' = \int c(\alpha')\,\alpha'\,\psi_{\alpha'}\,d\alpha'.$$

Since the packet is sharply defined, $c(\alpha')$ is appreciably different from
zero only in a very narrow interval enclosing the average value $\bar{\alpha}$.
Hence we have to a close approximation

$$\alpha\psi = \bar{\alpha}\int c(\alpha')\,\psi_{\alpha'}\,d\alpha' = \bar{\alpha}\psi. \tag{39·27}$$

Thus $\psi$ becomes an approximate eigenfunction for all the p's and q's,
and for any function of them. All the coordinates and momenta commute
to this approximation and the average value of any function of the
coordinates and momenta becomes equal to the same function of the
average values of its arguments. In particular $\overline{H(p,q)}$ goes over
into $H(\bar{p}, \bar{q})$ and $\overline{\partial H(p,q)/\partial q_k}$ goes
over into $\partial H(\bar{p}, \bar{q})/\partial\bar{q}_k$. Equations (39·25)
and (39·26) now yield

$$\frac{d\bar{p}_k}{dt} = -\int \psi^{*}\,\frac{\partial H}{\partial q_k}\,\psi\,d\tau = -\frac{\partial H(\bar{p}, \bar{q})}{\partial\bar{q}_k}, \qquad \frac{d\bar{q}_k}{dt} = \frac{\partial H(\bar{p}, \bar{q})}{\partial\bar{p}_k}. \tag{39·28}$$

Identifying the classical values of the p's and q's with the mean values for
the packet, we obtain the classical canonical equations of Hamilton and
thereby justify our quantum-mechanical definition of the momentum
operator conjugate to the generalized positional coordinate $q_k$. Of
course, we have not proved that $\frac{h}{2\pi i}\frac{\partial}{\partial q_k}$
is the only Hermitian operator whose mean value in the limiting case of a
sharply defined wave packet will go over into the classical momentum
conjugate to $q_k$. Whether other operators of this type exist or not is of no
great importance.

¹ Our results apply also to suitably defined mixtures.


40. SYMMETRY PROPERTIES OF THE WAVE EQUATION 


40a. Symmetry Properties in General. — The conservation of angular
momentum is one of several important symmetry properties of a free
atomic system or of its Hamiltonian operator. These symmetry properties
are of the greatest practical importance as they are responsible for
most of the degeneracy of the energy levels, for the classification of the
levels into non-combining term systems, etc.

In order to define the symmetry properties we assume a system
of coordinates $x_1, x_2, \cdots, x_f$ and consider the substitution or
transformation

$$x_1 \to \bar{x}_1 = \varphi_1(x_1, \cdots, x_f), \quad x_2 \to \bar{x}_2 = \varphi_2(x_1, \cdots, x_f), \quad \cdots, \quad x_f \to \bar{x}_f = \varphi_f(x_1, \cdots, x_f). \tag{40·1}$$

Let R denote the operation of replacing $x_1$ by $\bar{x}_1$ or $\varphi_1(x)$,
$x_2$ by $\bar{x}_2$ or $\varphi_2(x)$, $\cdots$. Thus

$$R f(x_1, \cdots, x_f) = f[\varphi_1(x), \varphi_2(x), \cdots, \varphi_f(x)] \equiv \bar{f}(x). \tag{40·2}$$

Here $\bar{f}(x)$ is simply the new function of x obtained by letting R act on
$f(x)$. Let us now define the operator $\bar{H}$ by the equation

$$R H \psi(x) = \bar{H}\bar{\psi}(x) = \bar{H} R \psi. \tag{40·3}$$

We may say that $\bar{H}$ is the operator into which H is carried by the
substitution R. If $\bar{H}$ is the same as the original operator H we say
that H is invariant under the substitution R. The property of being invariant
with respect to the substitution R is a symmetry property of H. It
means that H commutes with R. If R has an adjoint manifold which
includes D and a complete set of eigenfunctions it is a proper dynamical
variable which is conserved during the natural motion of the system.
In other words R is an integral of the motion described by the Schrödinger
wave equation.

As a simple example we consider the one-dimensional case where

$$H = -\frac{h^2}{8\pi^2\mu}\frac{d^2}{dx^2} + V(x).$$

Then, if R is any substitution (40·2),

$$R H \psi = -\frac{h^2}{8\pi^2\mu}\,\psi''[\varphi(x)] + V[\varphi(x)]\,\psi[\varphi(x)],$$

or

$$\bar{H} = -\frac{h^2}{8\pi^2\mu}\Bigl(\frac{d\varphi}{dx}\Bigr)^{-1}\frac{d}{dx}\Bigl[\Bigl(\frac{d\varphi}{dx}\Bigr)^{-1}\frac{d}{dx}\Bigr] + V[\varphi(x)].$$

In order that $\bar{H}$ shall be identical with H it is necessary and
sufficient that $\varphi$ has the special form

$$\varphi(x) = a \pm x,$$

where a is a constant, and that

$$\bar{V} = V(a \pm x) = V(x).$$

If V is not constant, it must either be periodic with period a or it must be
symmetrical about the point $x = a/2$.

Let us consider the latter case, of which the Planck linear oscillator
is a special example. Taking the point of symmetry as the origin, the
transformation R becomes the operation of replacing x by $-x$, or reversing
the direction of the x axis. We designate this particular transformation
or operator by the symbol U. U is Hermitian, since

$$\int_{-\infty}^{+\infty} [u^*(x)\,Uv(x) - v(x)\,Uu^*(x)]\,dx = \int_{-\infty}^{+\infty} [u^*(x)\,v(-x) - v(x)\,u^*(-x)]\,dx = 0. \tag{40·4}$$

The integral vanishes because the integrand is an odd function of x, i.e.,
a function which has opposite values at x and $-x$. Since $U^2 = 1$, and
$U = U^{\dagger}$, U belongs to the class of unitary operators defined in
Sec. 36c.

An operator such as U which is both Hermitian and unitary can have
but two eigenvalues $\pm 1$, for, since $U^2 = 1$, each eigenvalue must be a
real square root of unity. We shall use the term symmetry values for the
eigenvalues of these operators. Eigenfunctions of any of them are said
to be "symmetric" or "antisymmetric" with respect to the operator in
question according as the symmetry value is $+1$ or $-1$.

An even function of x is by definition a solution of the equation $Uf = f$
and, if quadratically integrable, can be regarded as an eigenfunction of U
with the eigenvalue $+1$. Similarly a quadratically integrable odd
function of x is an eigenfunction of U with the eigenvalue $-1$. An
arbitrary quadratically integrable function $f(x)$ can always be resolved
into the sum of two such eigenfunctions. To do so one has only to form
the functions

$$f^{(+)} = \tfrac{1}{2}(f + Uf), \qquad f^{(-)} = \tfrac{1}{2}(f - Uf). \tag{40·5}$$

Then

$$Uf^{(+)} = f^{(+)}, \qquad Uf^{(-)} = -f^{(-)}, \tag{40·6}$$

$$f = f^{(+)} + f^{(-)}. \tag{40·7}$$

Clearly U is a real dynamical variable.

Not only do U and H commute. They also satisfy the conditions (b)
and (c) of Sec. 37c (p. 285) needed to establish the fact that they form a
normal commuting pair. Thus U has a purely discrete spectrum and it is
possible to approximate any of its eigenfunctions with arbitrary precision
by means of another eigenfunction of the same eigenvalue which belongs
to the Hermitian domain of H. Hence every eigenfunction of H must
be expansible in terms of simultaneous eigenfunctions of U and H.
Since the one-dimensional energy equation is nondegenerate, this means
that every eigenfunction of H is also an eigenfunction of U. In other
words, the eigenfunctions of H are either even or odd functions of x.
The operator U is thus a function of H in the sense of the definition
on p. 281.
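The resolution of an arbitrary function into symmetric and antisymmetric parts is easily tried out numerically. In this sketch, which is an illustration rather than part of the original text, functions of x are ordinary Python callables, U is the reflection $x \to -x$, and the names `symmetric_part` and `antisymmetric_part` are hypothetical.

```python
import math

def U(f):
    """Reflection operator: (U f)(x) = f(-x)."""
    return lambda x: f(-x)

def symmetric_part(f):
    """f(+) = (f + U f) / 2, an eigenfunction of U with symmetry value +1."""
    Uf = U(f)
    return lambda x: 0.5 * (f(x) + Uf(x))

def antisymmetric_part(f):
    """f(-) = (f - U f) / 2, an eigenfunction of U with symmetry value -1."""
    Uf = U(f)
    return lambda x: 0.5 * (f(x) - Uf(x))

f = math.exp                      # neither even nor odd
f_plus = symmetric_part(f)        # equals cosh
f_minus = antisymmetric_part(f)   # equals sinh

x = 0.7
print(f_plus(x), math.cosh(x))
print(f_minus(x), math.sinh(x))
print(U(f_plus)(x) - f_plus(x), U(f_minus)(x) + f_minus(x))
```

The exponential is used only because its two parts are familiar; the same construction applies to any function for which both $f(x)$ and $f(-x)$ are defined.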

40b. The Reflection Operators. — In the theory of atomic and molecular
structure we meet with a number of operators similar in their properties
to U. Consider the substitution

$$x_i \to \bar{x}_i = -x_i, \qquad y_i \to \bar{y}_i = y_i, \qquad z_i \to \bar{z}_i = z_i, \tag{40·8}$$

where i ranges from 1 to f in a problem involving f particles. This
transformation carries each particle over into a position previously
occupied by its mirror image in the y,z plane. Hence the transformation
can be described as a reflection and its operator $R_x$ as a reflection
operator. $R_x$ is defined by the above transformation equations, or by

$$R_x f(x_1, y_1, z_1, x_2, y_2, \cdots, z_f) = f(-x_1, y_1, z_1, -x_2, y_2, \cdots, z_f). \tag{40·9}$$

The corresponding operators denoting reversal of the signs of the y and z
coordinates will be designated by $R_y$ and $R_z$, respectively. Another
operator which shares most of their properties is the product

$$K = R_x R_y R_z,$$

which can also be defined by the equation

$$K f(x_1, y_1, z_1, \cdots, z_f) = f(-x_1, -y_1, -z_1, \cdots, -z_f). \tag{40·10}$$

Let R denote an arbitrary member of the set of four operators $R_x$, $R_y$,
$R_z$, K. It is easy to see by the argument used for U that R is Hermitian
and unitary. Since an arbitrary function can always be resolved into the
sum of symmetric and antisymmetric functions by the device indicated
in Eqs. (40·5), (40·6), and (40·7), these operators represent true dynamical
variables.

$R_x$, $R_y$, $R_z$, K, and $\mathcal{L}^2$ all commute with the fundamental
Hamiltonian of Eq. (32·1) and with each other. They all have purely discrete
spectra and in fact are readily proved to unite with H to form a normal
commuting set of dynamical variables. Hence it is possible to expand an
arbitrary wave function in terms of simultaneous eigenfunctions of H,
$\mathcal{L}^2$, $R_x$, $R_y$, $R_z$, K. In the case of the two-particle
problem of Sec. 28, for example, we accomplish this by writing

$$\psi = \sum_{n=1}^{\infty}\ \sum_{l=0}^{n-1}\ \sum_{m=0}^{l} \cdots \tag{40·11}$$

$\mathcal{L}_z$ commutes with $R_z$ and K but not with $R_x$ and $R_y$. Hence
we cannot expand in terms of simultaneous eigenfunctions of $\mathcal{L}_z$ and
$R_x$ or $R_y$ but can expand in terms of simultaneous eigenfunctions of
$\mathcal{L}_z$, $R_z$, and K.
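40c. The Rotation Operator. — Consider next a many-particle problem
in which the potential energy V is independent of a rotation of the system
as a rigid whole about the z axis. Such a rotation is described by the
transformation

$$F:\quad x_i \to \bar{x}_i = x_i\cos\omega - y_i\sin\omega, \qquad y_i \to \bar{y}_i = x_i\sin\omega + y_i\cos\omega, \qquad z_i \to \bar{z}_i = z_i. \tag{40·12}$$

V is by hypothesis invariant with respect to F, and the symmetry of the
Laplacian operator insures that the complete energy operator H shall
commute with F.

The operator F has a reciprocal $F^{-1}$ obtained by reversing the sign of
the angle $\omega$. Clearly the application of F or $F^{-1}$ to the complete
integrand of an integral extended over all coordinate space cannot affect the
value of the integral. If $\psi$ and $\chi$ are any two quadratically
integrable functions, it follows that

$$(\psi, F^{-1}\chi) = \int F\bigl(\psi\,F^{-1}\chi^*\bigr)\,d\tau = (F\psi, \chi).$$

Thus F is unitary with respect to coordinate space and the class of all
quadratically integrable functions.

Expanding the function $F\psi \equiv \bar{\psi}$ in power series in $\omega$,
we obtain¹

$$F\psi = \bar{\psi} = (\bar{\psi})_{\omega=0} + \omega\Bigl(\frac{\partial\bar{\psi}}{\partial\omega}\Bigr)_{\omega=0} + \frac{\omega^2}{2!}\Bigl(\frac{\partial^2\bar{\psi}}{\partial\omega^2}\Bigr)_{\omega=0} + \cdots. \tag{40·13}$$

But

$$\Bigl(\frac{\partial\bar{\psi}}{\partial\omega}\Bigr)_{\omega=0} = \sum_i\Bigl(\frac{\partial\bar{\psi}}{\partial\bar{x}_i}\frac{\partial\bar{x}_i}{\partial\omega} + \frac{\partial\bar{\psi}}{\partial\bar{y}_i}\frac{\partial\bar{y}_i}{\partial\omega}\Bigr)_{\omega=0} = \sum_i\Bigl(x_i\frac{\partial}{\partial y_i} - y_i\frac{\partial}{\partial x_i}\Bigr)\psi = \frac{2\pi i}{h}\,\mathcal{L}_z\psi,$$

and

$$\Bigl(\frac{\partial^k\bar{\psi}}{\partial\omega^k}\Bigr)_{\omega=0} = \Bigl(\frac{2\pi i}{h}\,\mathcal{L}_z\Bigr)^k\psi.$$

Hence (40·13) is equivalent to

$$F\psi = \sum_{k=0}^{\infty}\frac{1}{k!}\Bigl(\frac{2\pi i}{h}\,\omega\,\mathcal{L}_z\Bigr)^k\psi = e^{\frac{2\pi i}{h}\omega\mathcal{L}_z}\psi. \tag{40·14}$$

Equation (40·13) leads us to a functional relationship between the
rotation operator $F(\omega)$ and the familiar angular momentum operator
$\mathcal{L}_z$. Since $\mathcal{L}_z$ is a real dynamical variable, it follows
that $F(\omega)$ is a dynamical variable. If $\mathcal{L}_z$ had not been
defined previously, we could define it by

$$\mathcal{L}_z = \frac{h}{2\pi i}\Bigl(\frac{\partial F}{\partial\omega}\Bigr)_{\omega=0}. \tag{40·15}$$

Moreover, the Hermitian character of $\mathcal{L}_z$ and the fact that it is an
integral of the motion are simple corollaries of the properties of
$F(\omega)$. Thus we can say that the existence of the family of rotation
operators $F(\omega)$ which commute with H generates the integral of the motion
$\mathcal{L}_z$.

¹ The expansion is legitimate if $\psi(x)$ is of class D, for in that case
$\psi(x)$ is analytic in the Cartesian coordinates except at points where V is
infinite. Hence it is analytic in $\bar{x}$ and it follows that $\bar{\psi}$
is analytic in $\omega$.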
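The equivalence of the rotation substitution and the exponential series can be verified directly for polynomial wave functions of a single particle in the x,y plane. The sketch below is illustrative and rests on stated assumptions: a polynomial is stored as a dictionary mapping exponent pairs (i, j) to the coefficient of $x^i y^j$, and the operator `A` stands for $x\,\partial/\partial y - y\,\partial/\partial x$, which is $(2\pi i/h)\mathcal{L}_z$, so that Planck's constant drops out of the series. All function names are hypothetical.

```python
import math

def A(poly):
    """Apply x d/dy - y d/dx, i.e. (2 pi i / h) L_z, to a polynomial in x and y."""
    out = {}
    for (i, j), c in poly.items():
        if j > 0:
            out[(i + 1, j - 1)] = out.get((i + 1, j - 1), 0.0) + j * c
        if i > 0:
            out[(i - 1, j + 1)] = out.get((i - 1, j + 1), 0.0) - i * c
    return out

def rotate_by_series(poly, omega, terms=60):
    """Sum of (omega A)^k / k! applied to poly, truncated after `terms` terms."""
    total = dict(poly)
    term = dict(poly)
    for k in range(1, terms):
        term = {key: c * omega / k for key, c in A(term).items()}
        for key, c in term.items():
            total[key] = total.get(key, 0.0) + c
    return total

def evaluate(poly, x, y):
    return sum(c * x**i * y**j for (i, j), c in poly.items())

def rotate_directly(poly, omega, x, y):
    """The substitution (40.12) carried out on the arguments."""
    xb = x * math.cos(omega) - y * math.sin(omega)
    yb = x * math.sin(omega) + y * math.cos(omega)
    return evaluate(poly, xb, yb)

psi = {(2, 1): 1.0, (1, 0): 3.0}      # x^2 y + 3 x, an arbitrary sample
x, y, omega = 0.4, -1.3, 0.7
print(evaluate(rotate_by_series(psi, omega), x, y), rotate_directly(psi, omega, x, y))
```

Setting `omega = math.pi` in the same sketch reproduces the reversal of both x and y, that is, the product of the two reflections $R_x R_y$.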

Let us next consider the case where V is invariant with respect to
all rigid rotations of the axes. Then an operator representing any such
rotation will commute with H. The most general rigid displacement of
the axes can be accomplished by a single rotation through a suitably
chosen angle $\omega$ about a suitably chosen axis. If the direction cosines
of the axis are $\lambda, \mu, \nu$, the rotation is represented by the vector
$\boldsymbol{\omega}$ whose components are $\omega_x = \lambda\omega$,
$\omega_y = \mu\omega$, $\omega_z = \nu\omega$. To apply such a rotation
to $\psi$ we can imagine a preliminary shift in reference axes from the
primary x, y, z system to a system x', y', z' so chosen that z' has the
direction of $\boldsymbol{\omega}$. If we now apply the operator
$e^{\frac{2\pi i}{h}\omega\mathcal{L}_{z'}}$ and then transform back to
the x, y, z system we shall have the equivalent of a direct rotation of the
x, y, z axes of the desired kind. But the operators
$\mathcal{L}_x, \mathcal{L}_y, \mathcal{L}_z$ are readily seen to transform
like the components of a vector. Hence

$$\mathcal{L}_{z'} = \lambda\mathcal{L}_x + \mu\mathcal{L}_y + \nu\mathcal{L}_z.$$

Consequently the desired substitution is effected directly on the original
wave function in x, y, z coordinates by the dynamical-variable operator

$$G(\boldsymbol{\omega}) = e^{\frac{2\pi i}{h}\omega(\lambda\mathcal{L}_x + \mu\mathcal{L}_y + \nu\mathcal{L}_z)} = e^{\frac{2\pi i}{h}(\omega_x\mathcal{L}_x + \omega_y\mathcal{L}_y + \omega_z\mathcal{L}_z)}.$$

The operator $G(\boldsymbol{\omega})$ is not to be identified with the product
of the three non-commutative operators
$e^{\frac{2\pi i}{h}\omega_x\mathcal{L}_x}$,
$e^{\frac{2\pi i}{h}\omega_y\mathcal{L}_y}$, and
$e^{\frac{2\pi i}{h}\omega_z\mathcal{L}_z}$, but with the power-series expansion

$$\sum_{k=0}^{\infty}\frac{1}{k!}\Bigl[\frac{2\pi i}{h}(\omega_x\mathcal{L}_x + \omega_y\mathcal{L}_y + \omega_z\mathcal{L}_z)\Bigr]^k.$$

$G(\boldsymbol{\omega})\psi$ is an analytic function of
$\omega_x, \omega_y, \omega_z$. If we expand $G(\boldsymbol{\omega})$ as a
formal multiple power series in $\omega_x, \omega_y, \omega_z$, the coefficient
of each term in the expansion is a homogeneous form in
$\mathcal{L}_x, \mathcal{L}_y, \mathcal{L}_z$ which is either Hermitian, or
becomes so when multiplied by i. Thus $G(\boldsymbol{\omega})$ generates a
large multiplicity of Hermitian operators (including
$\mathcal{L}_x, \mathcal{L}_y, \mathcal{L}_z$) which commute with H because
$G(\boldsymbol{\omega})$ commutes with H.

The operators $G(\boldsymbol{\omega})$ form a continuous group in the
technical sense of the word.¹ They are a subgroup of the more general
rotation-reflection group which includes also all operators of the type
$R_x$, $R_y$, $R_z$. The operator $F(\pi) = G(0,0,\pi)$ reverses the direction
of the x and y axes and is therefore equal to the product $R_x R_y$. It follows
that the product of $F(\pi)$ into K is $R_z$. Since $F(\pi)$ is a function of
$\mathcal{L}_z$, $R_z$ is a function of the commuting operators
$\mathcal{L}_z$ and K. Similarly $R_x$ and $R_y$ are functions of
$\mathcal{L}_x$ and K and of $\mathcal{L}_y$ and K, respectively. Consequently
every substitution in the rotation-reflection group can be effected with the
aid of K and $G(\boldsymbol{\omega})$.

Since the application of any member of this group, say
$G(\boldsymbol{\omega})$, to the integrand of a definite integral extended over
all configuration space cannot change the value of the integral, we have

$$(H\psi, \chi) = \int G(\chi^* H\psi)\,d\tau = \int (HG\psi)(G\chi)^*\,d\tau = (HG\psi, G\chi).$$

Consequently every such operator transforms the Hermitian manifold of H
into itself.

If K and every $G(\boldsymbol{\omega})$ commute with an operator $\alpha$, it
follows that every dynamical variable generated by the group of substitutions
commutes with $\alpha$. This is true of the dynamical variables K and
$\mathcal{L}^2$.

40d. The Permutation Operators. — In addition to the invariance of
the Hamiltonian operator of a free atomic system with respect to
substitutions of the rotation-reflection group there is an invariance with
respect to a second set of substitutions which permutes the coordinates
of equivalent particles, i.e., permutes the coordinates of different electrons
of the system, or of different protons, as the case may be. Symmetry
operators of this second type also generate dynamical variables which
are integrals of the motion.

The simplest class of permutation operators are the interchange
operators or transpositions which permute the coordinates of two particles
only. Thus $P_{ij}$ is the operator which performs the transformation

$$x_i \to x_j, \quad y_i \to y_j, \quad z_i \to z_j, \qquad x_j \to x_i, \quad y_j \to y_i, \quad z_j \to z_i,$$

and so interchanges the positions of the ith and jth particles. It is
convenient in discussing the permutations to designate the complete
set of Cartesian coordinates of the ith particle by $\xi_i$. With this notation
we can say that any permutation P effects a rearrangement of the
arguments $\xi_1, \cdots, \xi_n$ of the $\psi$ function to which it is applied.
Including the identical permutation, the number of different permutations which
can be applied to a wave function describing a system composed of n
equivalent particles is n!

¹ Cf., e.g., Carl Eckart, Rev. Mod. Phys. 2, 305 (1930) or the treatises on
quantum mechanics from the group-theory point of view by Wigner, Weyl, and
van der Waerden.

The result of two successive permutations is a permutation which we
call the product of the other two. It is immediately evident that every
one of the more complicated permutations can be built up as the product
of a number of simple interchange permutations. The associative law
holds for permutation products and each permutation has a reciprocal,
or inverse. Hence the permutations form a group like the rotation and
reflection operators.

The application of a permutation operator to the integrand of a
definite integral is like the application of a rotation operator to such an
integrand in leaving the value of the integral unchanged. It follows
(cf. p. 308) that the permutations transform the Hermitian manifold of
H into itself. The reader will also verify that they transform class D
into itself.

Let $P^{-1}$ denote the reciprocal of P. By the above rule

$$(P\psi, \chi) = (\psi, P^{-1}\chi).$$

Thus $P^{-1}$ is adjoint to P, and P has a unitary manifold which includes all
quadratically integrable functions.

The interchange operators $P_{ij}$ have the additional property that
$P_{ij}^2 = 1$. Hence they are Hermitian as well as unitary and each has a
complete system of eigenfunctions with the eigenvalues $\pm 1$. The
Hermitian manifold of any interchange operator includes class $D$ so
that these operators are real dynamical variables of type 1. Although
the general permutations are not Hermitian, they also can be proved to
have complete sets of eigenfunctions with discrete eigenvalues and are
proper dynamical variables. Every permutation when raised to a suitable
minimum power $n$, less than, or equal to, the number of permuted
particles, gives unity.¹ Hence every eigenvalue is an $n$th root of unity,
and is in general complex. Thus the eigenvalues are all of unit absolute
value in accordance with the general theorem of Sec. 36j, p. 276.

In order to apply a permutation $P$ to $F\psi(\xi_1, \cdots, \xi_n)$, where $F$
denotes a function of the operators for the coordinates and momenta,
we have to permute the subscripts in the expression for $F$ as well as

¹ This is a general property of groups containing a finite number of elements (cf.,
e.g., E. Wigner, Gruppentheorie und ihre Anwendung auf die Quantenmechanik der
Atomspektren, p. 65, Braunschweig, 1931).




in the expression for $\psi$. If $F$ is symmetrical in the coordinates and
momenta of all equivalent particles, and if $P$ permutes the subscripts of
equivalent particles only, it follows that $P$ commutes with $F$. In particular,
$P$ commutes with the Hamiltonian operator of any atomic system
when the permuted particles are all of the same species. It follows 
that the permutation operators are integrals of the motion like the 
rotation-reflection operators. 

The reciprocal of the product $\alpha\beta$ is $\beta^{-1}\alpha^{-1}$. If $\alpha$ and $\beta$ are unitary
and convert class $D$ functions into class $D$ functions, we have

$$(\alpha\beta\psi, \chi) = (\beta\psi, \alpha^{-1}\chi) = (\psi, (\alpha\beta)^{-1}\chi),$$

provided that $\psi$ and $\chi$ are physically admissible. Hence the product of
any member of the permutation group into any member of the rotation-
reflection group is a unitary operator. In fact it is easy to see that the
operators of these two groups, together with their products, can be
combined into a single unitary substitution group every element of which
commutes with $H$ in the case of a free atomic system. This group of
unitary integrals is called the group of the Schrödinger equation, or the
symmetry group of configuration space.¹

The complete impossibility of distinguishing one electron from another
implies the impossibility of measuring any dynamical variable which
does not commute with all the permutation operators (cf. Sec. 42b).
The space-symmetry integrals $R_x, R_y, R_z, \cdots$, do commute with the
permutation group and are measurable. On the other hand the $P$'s
themselves are not symmetrical and do not commute with each other in
general. For example,

$$P_{12}P_{23}\psi(\xi_1,\xi_2,\xi_3) = \psi(\xi_2,\xi_3,\xi_1), \qquad P_{23}P_{12}\psi(\xi_1,\xi_2,\xi_3) = \psi(\xi_3,\xi_1,\xi_2).$$
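
The failure of two interchanges to commute can be made concrete by letting the operators act on a function that simply records the order of its arguments. This is only a sketch: the factory `interchange` and the test function `psi` are hypothetical names, and the convention assumed is that $P_{ij}$ exchanges the $i$th and $j$th arguments of the function to which it is applied.

```python
def interchange(i, j):
    """Return the operator P_ij (1-based particle labels) acting on functions."""
    def P(psi):
        def permuted(*xi):
            args = list(xi)
            args[i - 1], args[j - 1] = args[j - 1], args[i - 1]
            return psi(*args)
        return permuted
    return P

P12 = interchange(1, 2)
P23 = interchange(2, 3)

def psi(x1, x2, x3):
    """Stand-in wave function that reports the order of its arguments."""
    return (x1, x2, x3)

labels = ("xi1", "xi2", "xi3")
first = P12(P23(psi))(*labels)    # P12 P23 psi: P23 acts first
second = P23(P12(psi))(*labels)   # P23 P12 psi: P12 acts first
twice = P12(P12(psi))(*labels)    # an interchange applied twice
```

The two orders of application rearrange the arguments differently, while `twice` returns the arguments in their original order, in agreement with $P_{ij}^2 = 1$.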

40e. Degeneracy and the Integrals of the Schrödinger Equation.

We saw in Sec. 37c that if two type 1 dynamical variables $\alpha$ and $\beta$ commute,
the application of $\beta$ to an eigenfunction of $\alpha$, say $\varphi_n$, frequently
gives a new eigenfunction of $\alpha$, linearly independent of $\varphi_n$ and belonging
to the same eigenvalue. Even if $\beta$ is not a type 1 dynamical variable,
$\beta\varphi_n$ is an eigenfunction of the type specified provided that it belongs to
the Hermitian manifold of $\alpha$ and is not a multiple of $\varphi_n$.

Every element $U$ of the group of the Schrödinger equation commutes
with $H$ and transforms the Hermitian manifold of $H$ into itself. If
$\psi_{ik}$ is a discrete eigenfunction of $H$ with the eigenvalue $E_k$, it follows that

$$HU\psi_{ik} = UH\psi_{ik} = E_k U\psi_{ik},$$

¹ Strictly speaking the permutation group which commutes with a Hamiltonian
depends on the number of particles of different species in the system under consideration.
Also the complete rotation-reflection group fails to commute with $H$ if an atom
is subject to an external electric or magnetic field. Hence the group of the Schrödinger
equation is variable according to the nature of $H$.





and that $U\psi_{ik}$ is also an eigenfunction of $H$ with the eigenvalue $E_k$.
$U\psi_{ik}$ will be linearly independent of $\psi_{ik}$ unless the latter function is
itself an eigenfunction of $U$. Unless $\psi_{ik}$ is a simultaneous eigenfunction
of all the operators in the group of the Schrödinger equation, some of the
transforms will necessarily be linearly independent of $\psi_{ik}$, showing that $E_k$
is degenerate.

If degeneracy is to result from the existence of an integral T of the 
Schrodinger equation, whether unitary or not, it is necessary that there 
shall also exist a second integral, say S, which does not commute with T. 
The condition is a corollary on the second theorem of p. 287. From that 
theorem it follows that if all integrals of the Schrodinger equation 
commute with any one of them, the operator in question must be a 
function of the energy operator. But a function of the energy cannot
transform an eigenfunction of $H$ into a linearly independent function.
Hence every operator which can be said to produce degeneracy must
commute with $H$ and at the same time fail to commute with a second
integral, S. 

The existence of two non-commuting integrals T and S is not sufficient 
of itself to establish degeneracy, but becomes so if the integrals T and 
S have the property of transforming eigenfunctions of H into functions 
which belong to the Hermitian manifold of $H$. We state this conclusion
in the form of a theorem. 

Theorem: Let $T$ and $S$ denote a pair of integrals of the Schrödinger
equation and let $\psi_{ik}$ denote an eigenfunction of $H$ with the eigenvalue $E_k$.
If $(TS - ST)\psi_{ik} \neq 0$, and if $T\psi_{ik}$ and $S\psi_{ik}$ belong to the Hermitian manifold
of $H$, it follows that the energy level $E_k$ is degenerate and that there
exist two or more simultaneous eigenfunctions of $H$ and $T$, or $H$ and $S$,
which belong to $E_k$, but to different eigenvalues of $T$, or $S$, as the case may be.

Proof: It follows from the hypotheses that $T\psi_{ik}$ and $S\psi_{ik}$ are eigenfunctions of $H$
with the eigenvalue $E_k$ (cf. Sec. 37c). Since $(TS - ST)\psi_{ik} \neq 0$, $\psi_{ik}$ cannot be a
simultaneous eigenfunction of all three of the operators $H$, $S$, $T$. Hence one of the
functions $T\psi_{ik}$ and $S\psi_{ik}$ must be linearly independent of $\psi_{ik}$ and $E_k$ must be degenerate.
We assume that $\psi_{ik}$ is not an eigenfunction of $T$; the argument is the same in form
if we assume that $\psi_{ik}$ is not an eigenfunction of $S$. Since $E_k$ has at most a finite
degeneracy (cf. Sec. 32k) we can express the eigenfunction $\psi_{ik}$ as a linear combination
of a finite orthonormal set of eigenfunctions, say $\psi_{1k}, \psi_{2k}, \cdots, \psi_{\lambda k}$. This set must
contain at least two members. Making a similar expansion of the transform of each
member of the set by $T$, we obtain $\lambda$ equations similar to (37·11). By means of a
linear transformation (cf. p. 284) we can always obtain a new orthonormal set of
functions which are simultaneous eigenfunctions of $H$ and $T$. These new functions
cannot have a single common eigenvalue for $T$ because $\psi_{ik}$ is a linear combination of
them and by hypothesis is not an eigenfunction of $T$. This completes the proof
of our theorem.
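
A finite-dimensional caricature may help to fix the content of the theorem. In the sketch below the three matrices `H`, `T`, `S` are assumptions chosen purely for illustration (they do not represent any atomic system), and the Hermitian-manifold conditions are trivially satisfied because every vector of a finite space is admissible.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def commutator(a, b):
    """The matrix ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(ab))] for i in range(len(ab))]

def is_zero(m):
    return all(x == 0 for row in m for x in row)

H = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]    # energy operator, levels E = 1 and E = 2
T = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]   # first integral
S = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]    # second integral, not commuting with T

psi = [[1], [0], [0]]                    # eigenfunction of H with E = 1

both_are_integrals = is_zero(commutator(H, T)) and is_zero(commutator(H, S))
commutator_on_psi = matmul(commutator(T, S), psi)
level = [i for i in range(3) if H[i][i] == 1]   # basis states belonging to E = 1
t_values = {T[i][i] for i in level}             # eigenvalues of T within the level
```

Both `T` and `S` commute with `H`, $(TS - ST)\psi \neq 0$, and the level $E = 1$ is found to be doubly degenerate with two different eigenvalues of $T$, as the theorem requires.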

Since the dynamical variables which make up the group of the 
Schrodinger equation for a free atomic system do not in general commute 





with each other, and since they transform the Hermitian domain of 
H into itself {cf. Secs. 40c and 40d), it is clear that these operators will 
produce degeneracy. We shall refer to degeneracy of this kind as 
symmetry degeneracy. 

In addition to symmetry degeneracy there are other but less important
types. Thus in the case of hydrogenic atoms treated without relativity
and spin corrections there is a degeneracy of the discrete energy levels
with respect to $\mathcal{L}^2$ as we saw in Sec. 29. This degeneracy is peculiar
to the two-particle problem with the Coulomb inverse-square law of
attraction and will be called Coulomb degeneracy. It breaks down the
$\mathcal{L}^2$ classification of the energy levels of the general two-particle problem
and indicates the existence of an operator $A$, which commutes with $H$,
but not with $\mathcal{L}^2$. Many such operators could be set up artificially in
integral form, but no simple substitution has been proposed which
accounts for this type of degeneracy as a symmetry property.¹

Another type of systematic degeneracy not accounted for as an
ordinary symmetry property is associated with the continuous part of
the energy spectrum. Consider, for example, the case of a two-particle
problem like that of the dumbbell model of the diatomic molecule
(Sec. 28h). Whereas in the discrete portion of the energy spectrum
every level is characterized by an individual eigenvalue of $\mathcal{L}^2$, the crowding
together of the levels in the continuous spectrum makes every eigenvalue
of $\mathcal{L}^2$ compatible with every eigenvalue of $H$. We call this
continuous-spectrum degeneracy. It is not readily accounted for as a
result of a symmetry property of the Hamiltonian.

In view of the circumstances giving rise to continuous-spectrum
degeneracy we may say that, in the above case, $\mathcal{L}^2$ is a function of $H$
in the discrete portion of the energy spectrum but is independent of $H$ in
the continuous portion of the spectrum. Within the discrete spectrum
$H$ and $\mathcal{L}_z$ suffice to form a complete set of normally commuting dynamical
variables, but, when the continuous spectrum is taken into account, we
must add $\mathcal{L}^2$ to $H$ and $\mathcal{L}_z$ to get such a set. Exactly the same phenomena
mark the passage from the discrete to the continuous spectrum in the
case of a general many-electron atom.

Finally we have to consider cases of non-systematic degeneracy, or 
accidental degeneracy, due to the chance energy agreement of individual 
stationary states which do not belong together as a result of any general 

¹ Recently, however, V. Fock, Bull. de l'Académie des Sciences de l'URSS, p. 179,
1935, has shown that the Schrödinger equation
for hydrogenic atoms in momentum space is identical with the integral equation for
spherical harmonics in four-dimensional space. Thus the Coulomb degeneracy can
be tied up with the symmetry properties of four-dimensional space.




rule. Examples of this kind are rare except in problems in which the
Hamiltonian involves a continuously variable parameter, such as the
strength of an electric or magnetic field. In such problems accidental
degeneracy is apt to occur for certain specific values of the parameter in
question at which a pair of energy levels are said to “cross” each other.
Since accidental degeneracy is exceptional, unpredictable, and easily
recognized when it does occur, it can be ignored in a general discussion of
normal systematic degeneracy.

It will now be seen that if we restrict our discussion to the discrete
spectra of atomic systems with two or more electrons, we have to consider
only the symmetry degeneracy associated with the group of the Schrödinger
equation. This problem will be attacked on the basis of the
postulate that all eigenfunctions of $H$ for any discrete energy level $E_k$ are
contained in the linear manifold formed from the transforms of any one
eigenfunction by the operators in the unitary substitution group of the
Schrödinger equation, together with their linear combinations.¹ This is
the fundamental postulate of all applications of group theory to atomic
structure problems. It is plausible a priori and is substantiated both
by experiment and by detailed calculation. It implies that the linear
manifold of eigenfunctions belonging to $E_k$ and obtained by application
of the unitary group of the Schrödinger equation is independent of the
choice of the initial wave function $\psi_{ik}$, so long as it belongs to $E_k$. This
means that all eigenfunctions belonging to $E_k$ have fundamentally the
same symmetry with respect to the operations which define the group.

40f. The Normal Degeneracy of the Energy Levels of Free Atomic 
Systems. — Owing to the existence of symmetry degeneracy we know that 
even within the discrete spectrum the Hamiltonian $H$ of such a system
does not of itself constitute a complete set of normally commuting 
dynamical variables. Our first problem is then to choose additional 
dynamical variables which either belong to the group of the Schrodinger 
equation, or are related to it, and can be added to H in order to build 
up a complete normally commuting set. The degeneracy of any energy 
level $E_k$ is then equal to the number of sets of eigenvalues of these
additional operators which are compatible with $E_k$. Since all the permutations
commute with all rotations and reflections, we can deal
separately with the two subgroups of operators. 

Consider first the rotation-reflection group. Finite rotations about 
different axes do not commute. This we know from geometry and from 
the fact that the corresponding components of angular momentum do 
not commute. On the other hand, all rotations about any one axis 
commute with each other and the corresponding component of angular 
momentum. Therefore, if we add to H any finite rotation, or any 

¹ The phrase “linear combination” is to be interpreted to include such limiting
cases as the derivative given in Eq. (40·16).




component of angular momentum, we shall have a normal commuting 
set of operators which does not commute with any independent member 
of the rotation group. This situation is unchanged if we pass from the 
pure rotation group to the rotation-reflection group, for K commutes 
with all the rotation and angular-momentum operators so that the 

commutation properties of $KG(\omega)$ are determined by those of $G$.

Let us arbitrarily choose $\mathcal{L}_z$ as the second member of our commuting
set. If the system under consideration contains not more than two
identical particles, there are only two permutations, one of which is the
identical permutation and both commute with $H$ and $\mathcal{L}_z$. In this case
we infer that the set of operators $H$, $\mathcal{L}_z$ is complete. If there are more
than two identical particles, it will be necessary to add one or more
permutations in order to secure a complete set.

The operators $\mathcal{L}^2$ and $K$ commute with every component of angular
momentum and every operator of the rotation-reflection group. For
example,

$$[\mathcal{L}^2, \mathcal{L}_x] = \mathcal{L}_y\mathcal{L}_z + \mathcal{L}_z\mathcal{L}_y - \mathcal{L}_z\mathcal{L}_y - \mathcal{L}_y\mathcal{L}_z = 0.$$

In fact, these operators commute with every member of the group of the
Schrödinger equation. Since they are integrals of the motion, there
exists a complete system of simultaneous eigenfunctions of $H$, $\mathcal{L}^2$, and $K$.
Let $\psi_{ik}$ denote an eigenfunction of this type associated with the discrete
energy level $E_k$. From the postulate in italics on p. 313, it follows that
every eigenfunction of $H$ with the eigenvalue $E_k$ can be derived from
$\psi_{ik}$ by taking linear combinations of transforms of $\psi_{ik}$ by the various
operators in the substitution group. But, since $\mathcal{L}^2$ and $K$ commute with
all members of the group of the Schrödinger equation, all eigenfunctions
of $H$ which belong to $E_k$ are eigenfunctions of $\mathcal{L}^2$ and $K$ with the same
eigenvalues as $\psi_{ik}$. In other words, there is just one pair of eigenvalues
of $\mathcal{L}^2$ and $K$ which is compatible with any discrete eigenvalue of $H$. Thus
$\mathcal{L}^2$ and $K$ are functions of $H$ insofar as concerns the discrete spectrum.

Thus the eigenvalues of $\mathcal{L}^2$ and $K$ form convenient indices for the
classification of energy levels. Stationary states which are symmetric
with respect to $K$ ($K' = +1$) are said to be even, while those which
are antisymmetric are said to be odd. This classification of levels into
even and odd types holds rigorously even when the electron spin is taken
into account. The spin-orbit interactions partially spoil the $\mathcal{L}^2$ classification,
however, replacing it by a similar $\mathcal{J}^2$ classification, where $\mathcal{J}$ denotes
the resultant of the orbital and spin angular momenta. In the case of the
two-particle problem without spin, where $\mathcal{L}^2$ has the form of Eq.
(34·18), eigenfunctions of $\mathcal{L}^2$ with even values of $l$ are symmetric with
respect to $K$ while those with odd values of $l$ are antisymmetric. Thus
$K$ is a function of $\mathcal{L}^2$ for such systems.

Evidently the number of different $\mathcal{L}_z$ values compatible with a given
energy level not degenerate with respect to $\mathcal{L}^2$ is not greater than the
number of $\mathcal{L}_z$ values compatible with the corresponding value of $L$, viz.,
$2L + 1$ (cf. Sec. 34f, p. 234). Neither, on the other hand, can the
degeneracy be less than $2L + 1$. To prove this we follow an argument
used by Dirac.¹ Let $\psi_{E,L,M}$ be a simultaneous eigenfunction of
$H$, $\mathcal{L}^2$, $\mathcal{L}_z$ with the discrete eigenvalues $E$, $L(L + 1)h^2/4\pi^2$, $Mh/2\pi$,
respectively. Since $\mathcal{L}_x$ and $\mathcal{L}_y$ commute normally with $H$ and $\mathcal{L}^2$,
$(\mathcal{L}_x + i\mathcal{L}_y)\psi_{E,L,M}$
must either vanish identically or be an eigenfunction of $H$ and $\mathcal{L}^2$ which,
like $\psi_{E,L,M}$, belongs to $E$ and $L$. The same remark applies to
$(\mathcal{L}_x - i\mathcal{L}_y)\psi_{E,L,M}$.

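The commutation relations (38·15), (38·16), and (38·17) imply that

$$(\mathcal{L}_x + i\mathcal{L}_y)\mathcal{L}_z - \mathcal{L}_z(\mathcal{L}_x + i\mathcal{L}_y) = -\frac{h}{2\pi}(\mathcal{L}_x + i\mathcal{L}_y)$$

or that

$$\mathcal{L}_z(\mathcal{L}_x + i\mathcal{L}_y) = (\mathcal{L}_x + i\mathcal{L}_y)\left(\mathcal{L}_z + \frac{h}{2\pi}\right). \qquad (40\cdot 16)$$

Hence

$$\mathcal{L}_z(\mathcal{L}_x + i\mathcal{L}_y)\psi_{E,L,M} = (M + 1)\frac{h}{2\pi}(\mathcal{L}_x + i\mathcal{L}_y)\psi_{E,L,M}. \qquad (40\cdot 17)$$

We conclude that $(\mathcal{L}_x + i\mathcal{L}_y)\psi_{E,L,M}$ is an eigenfunction of $\mathcal{L}_z$ with the
eigenvalue $(M + 1)h/2\pi$ unless it is identically zero. But, by (38·15),

$$(\mathcal{L}_x - i\mathcal{L}_y)(\mathcal{L}_x + i\mathcal{L}_y) = \mathcal{L}_x^2 + \mathcal{L}_y^2 - \frac{h}{2\pi}\mathcal{L}_z = \mathcal{L}^2 - \mathcal{L}_z^2 - \frac{h}{2\pi}\mathcal{L}_z. \qquad (40\cdot 18)$$

Hence

$$(\mathcal{L}_x - i\mathcal{L}_y)(\mathcal{L}_x + i\mathcal{L}_y)\psi_{E,L,M} = \frac{h^2}{4\pi^2}\left[L(L + 1) - M(M + 1)\right]\psi_{E,L,M} = \frac{h^2}{4\pi^2}\left[\left(L + \tfrac{1}{2}\right)^2 - \left(M + \tfrac{1}{2}\right)^2\right]\psi_{E,L,M}. \qquad (40\cdot 19)$$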


¹ P. A. M. Dirac, P.Q.M., 1st ed., p. 90.





Thus $(\mathcal{L}_x + i\mathcal{L}_y)\psi_{E,L,M}$ can vanish only when $M = L$, or $-(L + 1)$.
The latter case is impossible, for the relation $\mathcal{L}^2 - \mathcal{L}_z^2 = \mathcal{L}_x^2 + \mathcal{L}_y^2$
applied to a simultaneous eigenfunction of $\mathcal{L}^2$ and $\mathcal{L}_z$ yields

$$\int \psi^*_{E,L,M}(\mathcal{L}^2 - \mathcal{L}_z^2)\psi_{E,L,M}\,d\tau = \frac{h^2}{4\pi^2}\left[L(L + 1) - M^2\right] \geq 0. \qquad (40\cdot 20)$$

We conclude that it is impossible for $|M|$ to be greater than $[L(L + 1)]^{1/2}$
and for $M$ to take on the value $-(L + 1)$. Hence $(\mathcal{L}_x + i\mathcal{L}_y)\psi_{E,L,M}$ is
an eigenfunction of $\mathcal{L}_z$ with the eigenvalue $(M + 1)h/2\pi$ unless $M = L$,
in which case the function vanishes identically. Similarly
$(\mathcal{L}_x - i\mathcal{L}_y)\psi_{E,L,M}$
is an eigenfunction of $\mathcal{L}_z$ with the eigenvalue $(M - 1)h/2\pi$ unless
$M = -L$,
in which case it vanishes identically. In short, the existence of the
simultaneous eigenfunction $\psi_{E,L,M}$ implies the existence of a set of
$(2L + 1)$ such functions: $\psi_{E,L,-L}, \cdots, \psi_{E,L,M}, \cdots, \psi_{E,L,L}$.

This means that if $\mathcal{L}_x$ and $\mathcal{L}_y$ commute with the Hamiltonian $H$ (free
atom or molecule), an energy level compatible with the quantum number
$L$ must have $(2L + 1)$-fold degeneracy and must be compatible with all
values of $M$ from $-L$ to $+L$. The manifold made up of the $2L + 1$
simultaneous eigenfunctions of $H$, $\mathcal{L}^2$, and $\mathcal{L}_z$ with their linear combinations
will be called an $L$ complex.
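
The structure of an $L$ complex can be exhibited with small matrices. The sketch below works in units in which $h/2\pi = 1$ and assumes the usual real, positive phase convention for the matrix elements of $\mathcal{L}_x + i\mathcal{L}_y$, namely $[L(L+1) - M(M+1)]^{1/2}$, which is what Eq. (40·19) gives up to a phase; the function names are hypothetical.

```python
from math import sqrt, isclose

def l_complex(L):
    """Matrices of L_z, L_x + iL_y, L_x - iL_y in the basis M = -L, ..., +L."""
    dim = 2 * L + 1
    Ms = [-L + k for k in range(dim)]
    Lz = [[float(Ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    Lp = [[0.0] * dim for _ in range(dim)]
    for c, M in enumerate(Ms[:-1]):
        # raising operator connects M to M + 1
        Lp[c + 1][c] = sqrt(L * (L + 1) - M * (M + 1))
    Lm = [[Lp[c][r] for c in range(dim)] for r in range(dim)]  # transpose of Lp
    return Ms, Lz, Lp, Lm

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b):
    return all(isclose(x, y, abs_tol=1e-12)
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

L = 2
Ms, Lz, Lp, Lm = l_complex(L)
dim = len(Ms)
Lz_plus_1 = [[Lz[r][c] + (1.0 if r == c else 0.0) for c in range(dim)]
             for r in range(dim)]
# right-hand side of (40.19): diagonal with entries L(L+1) - M(M+1)
expected = [[float(L * (L + 1) - Ms[r] * (Ms[r] + 1)) if r == c else 0.0
             for c in range(dim)] for r in range(dim)]
top_state_annihilated = all(Lp[r][dim - 1] == 0.0 for r in range(dim))
```

With these matrices one can confirm Eq. (40·16) in the form $\mathcal{L}_z(\mathcal{L}_x + i\mathcal{L}_y) = (\mathcal{L}_x + i\mathcal{L}_y)(\mathcal{L}_z + 1)$, the diagonal form of Eq. (40·19), and the vanishing of the raised function when $M = L$.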

In the above discussion $L$ and $M$ were assumed from the beginning to be integers
as required by Sec. 34. For future reference, however, it is desirable to note that
with one slight ambiguity the same argument can be used to derive the eigenvalues
of $\mathcal{L}^2$ and $\mathcal{L}_z$ as well as to determine the degeneracy of a pair of mutually compatible
eigenvalues of $H$ and $\mathcal{L}^2$.

Let us accordingly consider a set of three dynamical variables $\alpha$, $\beta$, $\gamma$ which obey
the commutation rules

$$[\alpha, \beta] = -\gamma, \qquad [\beta, \gamma] = -\alpha, \qquad [\gamma, \alpha] = -\beta. \qquad (40\cdot 21)$$

Let us denote the typical eigenvalue of $\gamma$ by the symbol $mh/2\pi$ without making any
assumption regarding the values, integral or non-integral, which the parameter $m$
can take on. Let $\omega^2$ denote the operator

$$\omega^2 = \alpha^2 + \beta^2 + \gamma^2. \qquad (40\cdot 22)$$

We denote the eigenvalues of $\omega^2$ by $l(l + 1)h^2/4\pi^2$, where $l$ like $m$ is to be treated as a
complete unknown.

$\alpha$, $\beta$, $\gamma$ are assumed to transform class $D$ functions into class $D$ functions. Replacing
$\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ in the above argument by $\alpha$, $\beta$, $\gamma$, respectively, we see that the existence
of a simultaneous eigenfunction of $H$, $\omega^2$, $\gamma$, say $\psi_{E,l,m}$, implies the existence of the set:
$\psi_{E,l,-l}, \cdots, \psi_{E,l,m}, \cdots, \psi_{E,l,l}$. The values of $m$ form a series, the successive
members of which differ by unity and which terminates at the values $\pm l$.
Hence $l$ and $m$ must both be restricted to integral values, or to values which are odd
multiples of $\tfrac{1}{2}$. The operators $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ give a special case in which $m$ and $l$ take on
integral values. In the discussion of electron spin (Sec. 61b) we shall see that identifications
of the operators $\alpha$, $\beta$, $\gamma$, $\omega^2$ exist in which the “quantum numbers” $l$ and $m$
are restricted to half-integral values (odd multiples of $\tfrac{1}{2}$).
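
The restriction on $l$ follows from nothing more than the requirement that a series descending from $m = +l$ in unit steps shall end exactly at $m = -l$. A minimal sketch of this counting argument, using exact rational arithmetic and a hypothetical helper `ladder`, is the following.

```python
from fractions import Fraction

def ladder(l):
    """Values m = l, l - 1, ... if the unit-step series ends exactly at -l, else None."""
    m, values = l, []
    while m > -l:
        values.append(m)
        m -= 1
    if m == -l:
        values.append(m)
        return values
    return None

# trial values of l in steps of one quarter, from 0 to 3
candidates = [Fraction(k, 4) for k in range(13)]
allowed = [l for l in candidates if ladder(l) is not None]
spin_half = ladder(Fraction(1, 2))
```

Of the trial values only those for which $2l$ is a non-negative integer survive, i.e., the integers and the odd multiples of $\tfrac{1}{2}$; for $l = \tfrac{1}{2}$ the series consists of the two values $\pm\tfrac{1}{2}$.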


Having established the fact that the mutual compatibility of the energy
level $E$ and the angular-momentum quantum number $L$ implies the
existence of an $L$ complex of simultaneous eigenfunctions of $H$, $\mathcal{L}^2$, and $\mathcal{L}_z$,
we turn to the question of the additional degeneracy to be expected
from the permutation group of symmetry operators. A complete dis- 
cussion of this question will not be attempted, but one important fact 
about it will be established. 

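Let us assume that by applying the operators of this group to the
function $\psi^{(1)}_{E,L,M} = \psi_{E,L,M}$ it is possible to generate $N - 1$ linearly independent
new functions, $\psi^{(2)}_{E,L,M}, \cdots, \psi^{(N)}_{E,L,M}$. Because every permutation
operator commutes normally with $H$, $\mathcal{L}^2$, $\mathcal{L}_z$, each of these functions
is a simultaneous eigenfunction of $H$, $\mathcal{L}^2$, $\mathcal{L}_z$ which belongs to $E$, $L$, and $M$.
Without loss of generality we can assume that the functions in this set
are normalized and mutually orthogonal. By application of the operators
$\mathcal{L}_x \pm i\mathcal{L}_y$ to each of the functions $\psi^{(k)}_{E,L,M}$ we can generate a complete
corresponding $L$ complex. In this way we obtain a total of $N(2L + 1)$
wave functions for the given pair of values of $E$ and $L$. All of these wave
functions are orthogonal and hence linearly independent. Consider,
for example, the pair of functions $\psi^{(k)}_{E,L,M+1}$, $\psi^{(k')}_{E,L,M+1}$, obtained by applying
$\mathcal{L}_x + i\mathcal{L}_y$ to $\psi^{(k)}_{E,L,M}$ and $\psi^{(k')}_{E,L,M}$, where $k \neq k'$. Apart from normalizing
factors their scalar product is

$$\left((\mathcal{L}_x + i\mathcal{L}_y)\psi^{(k)}_{E,L,M},\ (\mathcal{L}_x + i\mathcal{L}_y)\psi^{(k')}_{E,L,M}\right) = \left(\psi^{(k)}_{E,L,M},\ (\mathcal{L}_x - i\mathcal{L}_y)(\mathcal{L}_x + i\mathcal{L}_y)\psi^{(k')}_{E,L,M}\right) = \frac{h^2}{4\pi^2}\left[L(L + 1) - M(M + 1)\right]\left(\psi^{(k)}_{E,L,M},\ \psi^{(k')}_{E,L,M}\right) = 0.$$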




The problem of evaluating N is more difficult than that of determining 
the size of an $L$ complex. Wigner¹ has dealt with the problem by the
methods of group theory and Hund² has given an alternative procedure
which avoids those methods. However, it would be of little use for us 
to give a general discussion of this problem here since we have not yet 
introduced the electron-spin coordinates and the Pauli exclusion principle 
into the theory. In Chap. XIV we shall show how to deal with the 
question very simply in the light of the exclusion principle and the theory 
of electron spin. 

¹ E. Wigner, Zeits. f. Physik 40, 492 (1926), 40, 883 (1927), 43, 624 (1927).

² F. Hund, Zeits. f. Physik 43, 788 (1927).



CHAPTER IX


THE MEASUREMENT OF DYNAMICAL VARIABLES 
41. GENERAL THEORY OF MEASUREMENT 

41a. Fundamental Characteristics of Measurements. — Having given 
a purely mathematical definition of a general dynamical variable $\alpha$, it
behooves us to discuss the nature of the process of observing, or measuring,
this variable. In classical physics and in quantum physics an
observation is basically an operation involving an interaction between
the system, or object, under observation, and an observing mechanism.
The interaction between the observed system and the observing mechanism
terminates in sense perceptions on the part of the observer. In
addition to this physical operation there must be a rule, based on definition
or theory, for the computation of a number, or numbers, from the
sense impressions.

The purpose of a measurement is always to determine some property
of the system, whether that property be a fixed one determined by the
inherent structure of the system, such as the atomic number of an atom,
and defining, or defined by, its Schrödinger equation, or a variable
property depending on its subjective state. Our present discussion has
to do solely with the measurements of the latter kind, i.e., measurements
of the quantum analogues of the dynamical variables of classical
mechanics. We therefore assume that the structure of the system
observed, and its Schrödinger equation, are known a priori.

The problem of relating the experimental procedure for the measure- 
ment of a dynamical variable with the mathematical definition can be 
approached from either side. As in the discussion of momentum in 
Sec. 15 we can assume an experimental procedure and attempt to deduce 
from it the nature of the corresponding mathematical operator, if any; 
or, we can start from the mathematical definition of a dynamical variable 
as a class of operators and attempt to devise an experimental procedure
which will measure the eigenvalues. In either case a theory of the
interaction between the observed system and the observing mechanism 
is necessary. 

It is convenient to distinguish from the beginning between individual 
measurements designed to attach a specific number to an individual 
system, and statistical measurements designed to determine the possible 
eigenvalues of an operator and their relative probabilities for an assemblage
of identical systems. As previously indicated, the test of quantum-mechanical
theory is to be found in statistical measurements. A
contradiction between theory and experiment can be established by means
of an individual experiment if it yields a result in conflict with the
scheme of theoretically allowed eigenvalues. Complete confirmation
of the theory can be obtained, however, only with the aid of a suitable
assemblage. Hence statistical experiments are of primary concern.

A statistical measurement designed to test the predictions of wave mechanics
should consist ideally of many entirely distinct individual measurements made on the
elements of an ideal Gibbsian assemblage of similarly prepared identical systems.
Where these systems are of a microscopic character, however, it is not customary to
make observations in this ideal laborious fashion. As indicated in Sec. 14b, p. 55,
we can then substitute wholesale observations on a concrete assemblage of approximately
independent systems for the series of individual observations. Thus an
ordinary spectroscopic measurement obtained by a single exposure of a photographic
plate gives the statistical distribution of the energies of the photons emitted by a
particular light source.

It is to be noted that an attempted individual observation can be either successful
or unsuccessful. For example, if one attempted to measure the energy of an individual
photon by sending it through a spectroscope, it might be reflected from one
of the prism faces and so fail to reach the detecting apparatus. Or, if one attempted
to observe the position of an electron by arranging a collision with a second electron,
the experiment might fail owing to poor marksmanship. Usually an apparatus for
the statistical measurement of a dynamical variable is theoretically capable of giving
individual measurements when used in conjunction with individual observed systems,
but there may be serious practical difficulties about the latter in the case of atomic
systems and the efficiency of such observations (percentage of successful experiments)
would frequently be small.

At the outset of this discussion it will be well to emphasize the difference
between a scheme of measurement which is satisfactory from the
classical standpoint and one which is satisfactory from the quantum-
mechanical point of view. In the case of macroscopic mechanical
systems the momenta are so large, on the average, that diffraction 
effects and momentum changes due to observations of position are 
negligible. For such systems a coordinate and its conjugate momentum 
are simultaneously measurable, in theory, with an error which can be 
ignored for all ordinary purposes. It follows that any function of the 
positional coordinates and their momenta is measurable with a precision 
which is wholly satisfactory from the point of view of classical dynamics. 

In the case of microscopic systems, however, the limitation on the 
simultaneous measurements of coordinates and momenta inherent in 
the Heisenberg uncertainty principle is a serious one. Thus the meas- 
urement of atomic energies by a calculation based on approximate 
simultaneous observations of the coordinates and momenta is both theo- 
retically and practically impossible. Happily, however, methods have 
been developed for the measurement of atomic energies (the spectroscopic 
and electron-collision methods) which yield consistent results of very 





great accuracy. These methods define the energies of atomic systems 
operationally, and our calculations indicate that the values observed 
are the eigenvalues of the corresponding Schrodinger Hamiltonian 
operators. We are thus led to require of a quantum-mechanically 
satisfactory method of measurement that it shall be capable in principle 
of yielding results of essentially arbitrary precision independent of 
limitations due to the Heisenberg uncertainty principle. Such a measurement
must not involve the simultaneous determination of the values of
conjugate coordinates and momenta. In view of the results already
obtained for the energy and momentum we look for the correlation of
every measurement of this type with a corresponding Hermitian operator.

41b. Pure States and Mixtures. We have assumed from the beginning
the existence of assemblages of identical systems so prepared that
their statistical properties can be described by means of a wave function.¹
Here the statement that an assemblage exists is meant to imply merely
that in principle it is possible to prepare such an assemblage. The
union of different identically constructed systems into an assemblage
is a mental operation rather than a physical one. Hence it is necessarily
possible to unite the systems of two or more assemblages into a single
superassemblage. Let $N_A$ and $N_B$ denote the numbers of systems in the
pure-state assemblages $A$ and $B$. We refer to the ratios

$$w_A = \frac{N_A}{N_A + N_B}, \qquad w_B = \frac{N_B}{N_A + N_B}$$

as the weights of $A$ and $B$ in the superassemblage $A + B$.
Let $W_A$, $W_B$ denote the probabilities of an event, or experimental result,
for the assemblages $A$ and $B$, respectively. Let $W_{A+B}$ denote the probability
of the same event for the combined assemblage. As the systems
in $A$ and $B$ are completely independent, it is necessary that

$$W_{A+B} = w_A W_A + w_B W_B. \qquad (41\cdot 1)$$

So much follows from the basic rules regarding the calculation of all
independent probabilities.
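
As a numerical illustration of Eq. (41·1), the following sketch combines two assemblages; the function name `mixture_probability` and the particular counts and probabilities are assumptions chosen only for the example.

```python
def mixture_probability(counts, probabilities):
    """Probability of an event in a superassemblage built from subassemblages.

    counts[k] is the number of systems in the kth subassemblage and
    probabilities[k] the probability of the event in that subassemblage.
    """
    total = sum(counts)
    weights = [n / total for n in counts]          # w_A, w_B, ...
    return sum(w * p for w, p in zip(weights, probabilities))

# assumed example: N_A = 300 systems with W_A = 0.2, N_B = 100 systems with W_B = 0.6
W_combined = mixture_probability([300, 100], [0.2, 0.6])
```

With these numbers the weights are $w_A = 0.75$ and $w_B = 0.25$, so that $W_{A+B} = 0.75 \times 0.2 + 0.25 \times 0.6 = 0.30$.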

Following von Neumann we designate an assemblage whose statistical 
properties can be described by a single wave function as a pure case, or 
pure state. An assemblage obtained by mixing two or more pure-case 
assemblages, or having the statistical properties of one so obtained, will 
be called a mixed-case assemblage, or briefly, a mixture. 

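The expectation value of a dynamical variable $\alpha$ for a pure-state
assemblage with wave function $\Psi$ is $\bar{\alpha} = (\Psi, \alpha\Psi)$. In the case of a
mixture of pure-case subassemblages with the wave functions $\Psi_1, \Psi_2,
\cdots, \Psi_j, \cdots$ and the weights $w_1, w_2, \cdots, w_j, \cdots$ we have

$$\bar{\alpha} = \sum_j w_j (\Psi_j, \alpha\Psi_j).$$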


¹ Cf. pp. 62, 70.





The question now arises whether a mixture of two or more pure-case
subassemblages is equivalent to a single pure-case assemblage or is
actually something more general.

To settle this question we assume that the answer is affirmative and
prove a contradiction. Let the wave functions $\Psi_1$, $\Psi_2$, $\ldots$, $\Psi_j$, $\ldots$, $\Psi_n$
of the components of the above specified mixture be simultaneous
eigenfunctions of a complete normal set of independent commuting dynamical
variables $\alpha_1$, $\alpha_2$, $\ldots$. Let $\alpha^{(j)}$ denote the set of eigenvalues of the
$\alpha$'s which belongs to $\Psi_j$. These sets are assumed all different. The
probability of $\alpha^{(j)}$ for the mixture is $w_j$. But if the mixture is equivalent
to a pure-case assemblage, the corresponding wave function must be a
linear combination of the wave functions of the subassemblages, say,

$$\Psi = \sum_{j=1}^{n} c_j \Psi_j. \qquad (41\cdot2)$$

The probability of the set of eigenvalues $\alpha^{(j)}$ computed from $\Psi$ is $|c_j|^2$.
Consistency demands that $|c_j|^2 = w_j$. In order that $\Psi$ shall also represent
the statistical properties of the superassemblage with respect to the
positional coordinates it is necessary that

$$|\Psi|^2 = \sum_{j=1}^{n} w_j |\Psi_j|^2, \qquad (41\cdot3)$$

whereas (41·2) implies that $|\Psi|^2 = \bigl|\sum_{j=1}^{n} c_j \Psi_j\bigr|^2$. These equations are in
general inconsistent. Hence we conclude that in general a mixture of
pure cases is not a pure case. An important exception occurs when we
mix several pure-case assemblages whose wave functions are multiples
of one another. It is immediately evident that in a case of this kind,
Eqs. (41·2) and (41·3) involve no contradiction. Hence such assemblages
can always be united into a single superassemblage which is a pure
state with a wave function which is a multiple of the wave function of
each of the subassemblages.
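
The inconsistency can be made concrete with a small numerical sketch. It assumes, purely for illustration, two normalized particle-in-a-box eigenfunctions on the interval from 0 to 1, equal weights, and real coefficients; none of these choices comes from the text. The density computed from the linear combination differs from the weighted sum of densities by an interference (cross) term.

```python
import math


def phi(n, x):
    """Normalized box eigenfunction sqrt(2) sin(n pi x) on 0 <= x <= 1."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


# Assumed weights of the two pure-case subassemblages.
w1, w2 = 0.5, 0.5
# Coefficients with |c_j|^2 = w_j; the phases are taken as zero here.
c1, c2 = math.sqrt(w1), math.sqrt(w2)

x = 0.25  # sample point in coordinate space

# Weighted sum of the separate densities (what the mixture requires).
mixture_density = w1 * phi(1, x) ** 2 + w2 * phi(2, x) ** 2
# Density of the single linear combination (what a pure case would give).
pure_density = (c1 * phi(1, x) + c2 * phi(2, x)) ** 2
# Interference term responsible for the disagreement.
cross_term = 2.0 * c1 * c2 * phi(1, x) * phi(2, x)

print(mixture_density, pure_density, cross_term)
```

No choice of phases for the coefficients removes the cross term at every point at once, which is why the two requirements cannot both hold in general.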

When all we know of the state of a system is that it belongs to a 
mixture of pure-state assemblages, we shall speak of it as being in a 
mixed state. 

The difficulty which makes it impossible to replace a mixture of 
non-identical pure-state assemblages with a single equivalent pure-state 
assemblage also prevents the resolution of a pure-state assemblage into a 
mixture of non-identical pure-state subassemblages. Von Neumann
has treated these questions at length.¹ The important fact for us to
note here is that an assemblage of similarly prepared identical systems 
must be assumed to be a mixture rather than a pure state unless the 

¹ von Neumann, M.G.Q., IV, 1-3.





method of preparation has been such as to insure that no system shall be 
admitted to the final assemblage which is not in the same state as every 
other system. The methods used in preparing such a pure-state assemblage
have been indicated in Sec. 15f, p. 70.

41c. Postulates Regarding Retrospective and Predictive Measure- 
ments. — There are two different aspects of measurements or observations 
which it is convenient to distinguish at this point. These are the backward-looking
aspect and the forward-looking aspect. Let us consider
them first of all from the classical standpoint. In classical theory there 
is no reason why the measurement of a dynamical variable need alter 
its value or change the state of the system under observation. Hence 
it is customary to assume that classical measurements give the values 
of the quantities observed both before and after the measurement. 
Actually, however, there are plenty of classical processes which we can
hardly fail to classify as measurements in which the state of the system is
not preserved. A chemical analysis does not leave the material under
investigation in its original state but gives properties of the sample
analyzed before the measurement. It may result in the preparation of
samples of pure materials and hence yields data concerning the properties
of these end products. The information regarding the future is thus of an
entirely different character from that regarding the past. Moreover the 
backward-looking and forward-looking aspects of a measurement can 
be completely dissociated. It is possible to make backward-looking 
measurements which change the state of the system under observation
and yield no definite information regarding its final state. We can also 
make forward-looking measurements which have no backward-looking 
aspect. The former is illustrated by a measurement of momentum 
following the method of Sec. 15, but making the final observation of 
position by a violent impact. The latter is illustrated by methods of 
factory production in which a standardized product is turned out by
means of a punch or templet. 

From the standpoint of quantum mechanics the distinction between 
the backward-looking and forward-looking aspects of measurement is 
more important than from the standpoint of classical theory because the 
interaction of the observing mechanism with the observed system, 
ignored in the classical theory, prevents the design of measurements 
which always leave the system in exactly the same state after the observa- 
tion as before. It is sometimes possible to carry out a measurement so 
that if the system observed is initially in an eigenstate of the dynamical 
variable $\alpha$ which is to be measured, its wave function, and hence its
state, are the same after the measurement as before. With the same 
experimental arrangement, however, the wave function will be changed 
by the measurement if its initial form is not that of an eigenfunction. 
Thus in the case of any measurement of a microscopic system capable of 





exhibiting quantum-mechanical properties, the information obtained
regarding the initial state is the same as that regarding the final state, 
only if the initial state itself, as well as the experimental procedure, is 
properly chosen. We accordingly label a measurement as retrospective 
or predictive according as it yields information about the immediately 
preceding or immediately succeeding state of the measured system.
Of course, it is possible for a measurement to be both retrospective and 
predictive. 

After the above preliminary discussion we are in a position to lay 
down certain theoretical requirements, or postulates, regarding the
measurement of a dynamical variable $\alpha$ defined as in Sec. 36.

a. A successful, exact, individual measurement of $\alpha$, whether retrospective
or predictive, must always yield as its numerical result one of the
eigenvalues of $\alpha$.

b. If an exact individual predictive measurement is followed immediately 
by a successful exact individual retrospective measurement, the two results 
must agree. 

c. The probability of any eigenvalue $\alpha'$ or any range of eigenvalues $d\alpha'$
in an individual retrospective measurement of $\alpha$ for a system initially
belonging to a pure-state assemblage must be in harmony with the Q formulas
(36·77) and (36·78).

In the case of a mixed type of assemblage the probability will then be
given automatically by the appropriate generalization of (41·1).

In conformity with c the theoretical calculation of the probability of
an elementary range of eigenvalues $d\alpha'$ for a pure-state assemblage is
to be carried out as follows: First one must resolve the Cartesian wave
function into a series-integral (spectrum) whose elements are eigenfunctions
of $\alpha$. One next forms the partial sum (or integral), say $(\Psi_x)_{d\alpha'}$,
of those elements of this spectrum which belong to the range $d\alpha'$ and
identifies the desired probability $Q(\alpha')\,d\alpha'$ with the scalar product, or
integrated intensity, $((\Psi_x)_{d\alpha'}, (\Psi_x)_{d\alpha'})$ of this partial wave function.¹
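
The recipe just described (resolve the wave function into eigenfunctions, keep the elements in the chosen range, take the integrated intensity of the partial sum) can be sketched numerically for a discrete spectrum. The example below is an illustration under stated assumptions: a particle confined to the interval from 0 to 1, the box energy as the measured variable, and a trial wave function $\sqrt{30}\,x(1-x)$. These choices and all names in the code are hypothetical and do not come from the text.

```python
import math

N_GRID = 2000
DX = 1.0 / N_GRID
XS = [(i + 0.5) * DX for i in range(N_GRID)]  # midpoint grid on [0, 1]


def psi(x):
    """Normalized trial wave function sqrt(30) x (1 - x)."""
    return math.sqrt(30.0) * x * (1.0 - x)


def eigenfunction(n, x):
    """Normalized eigenfunction of the box energy, sqrt(2) sin(n pi x)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)


def coefficient(n):
    """Expansion coefficient of psi on the n-th eigenfunction (midpoint rule)."""
    return sum(psi(x) * eigenfunction(n, x) for x in XS) * DX


def probability(levels):
    """Integrated intensity of the partial sum over the chosen levels.

    For orthonormal eigenfunctions this equals the sum of squared coefficients.
    """
    return sum(coefficient(n) ** 2 for n in levels)


p_ground = probability([1])
p_first_three = probability([1, 2, 3])
print(p_ground, p_first_three)
```

Enlarging the set of levels can only increase the probability, and the total over all levels approaches unity because the trial function is normalized.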

¹ According to the scheme of Eq. (36·77) one should choose a complete set of
dynamical variables $q_1$, $q_2$, $\ldots$ of which $\alpha$ is a member and compute $Q(\alpha')\,d\alpha'$ from
the probability amplitude $\Psi_q(q')$. $\Psi_q(q')$ can be regarded as the sum of orthogonal
elements each of which is equal to $\Psi_q(q')$ in the corresponding range of $\alpha$ eigenvalues
and vanishes outside that range. The transformation of these elementary functions back into
Cartesian-coordinate space, according to the rule

$$(\Psi_x)_{d\alpha'} = T\,(\Psi_q)_{d\alpha'},$$

gives a set of exact or approximate eigenfunctions of $\alpha$ in Cartesian space whose sum
is $\Psi_x$. Owing to the unitary character of $T$ the integrated intensity of $(\Psi_x)_{d\alpha'}$
in Cartesian space is equal to the integrated intensity of $(\Psi_q)_{d\alpha'}$ in $q'$ space, which is
evidently equal in turn to the value of $Q(\alpha')\,d\alpha'$ worked out from Eq. (36·77).




As previously indicated, any procedure for dealing simultaneously 
with all the elements of an assemblage of sensibly independent systems, 
which yields the eigenvalues of $\alpha$ and their probabilities, is to be reckoned
a valid statistical measurement, even though it is not, strictly speaking, 
a compound of many individual measurements. 

It follows from b that an assemblage of similar systems correlated
with a common eigenvalue $\alpha'$ as a result of a predictive measurement
must be of such a character that a statistical retrospective measurement
will show the probability of $\alpha'$ to be unity. Hence this assemblage must
be a mixture of pure-state subassemblages each of which has a wave function
which is an eigenfunction of $\alpha$ with the eigenvalue $\alpha'$. In special
cases the mixture can reduce to a pure state. This will happen, for
instance, if the eigenvalue $\alpha'$ under consideration is nondegenerate, so that
all the corresponding eigenfunctions are linearly dependent and physically
equivalent.

In the case of a pure-state assemblage prepared by a predictive 
measurement the wave function must be "physically admissible," as
it is created by physical manipulation rather than by mathematical 
definition. This means that it is impossible to carry out exact pre- 
dictive measurements of dynamical variables which do not have class D 
eigenfunctions. We are accustomed, however, to idealize measuring 
instruments by treating them on a classical basis and ascribing to them 
discontinuous properties, such as smooth surfaces and edges, which from 
a microscopic point of view they do not possess. If such idealizations 
are allowed in the theory of a measurement, they produce discontinuities
in the wave functions of assemblages prepared by predictive measure- 
ment. Hence the assumption that the wave functions of pure-state 
assemblages prepared by predictive measurements must be of class D 
is very inconvenient. We shall therefore introduce a milder restriction 
in the form of a fourth postulate, which implies the admission of idealized 
measurements along with those which are actually possible. 

d. The eigenfunctions of pure-state assemblages prepared by predictive 
measurements, or of the pure-state subassemblages which make up a mixed 
assemblage prepared in this way, are all quadratically integrable.

As the eigenfunctions of continuous-spectrum eigenvalues are not 
quadratically integrable, we infer from d that exact predictive measurements
of continuous-spectrum eigenvalues are fundamentally impossible.¹
Experience shows that exact retrospective measurement of such eigen- 
values is equally impossible, although this fact is not an immediate 
consequence of our present postulates. In these cases where exact 
measurements are not possible, we assume that inexact measurements, 

¹ For many purposes it is convenient to reckon discrete eigenvalues as exactly
measurable, but the above statement should not be understood to imply that such 
exact measurements are strictly speaking realisable. 





which conform to a and d and satisfy b and c to an arbitrarily high
degree of approximation, are to take their place. 

The reader will now inquire by what means we are to make connections 
between the above requirements a, b, c and concrete experimental 
methods. In discussing this general problem it will be of advantage to 
examine first the case of positional measurements. They occupy a
special place in the development of the theory because of our initial
physical interpretation of $|\Psi|^2$ as probability density in coordinate space.
In fact our present postulates when applied to positional measurements
reduce to an amplification of assumption b of Sec. 14a.

The interpretation of $|\Psi|^2$ as probability density has operational
meaning only if we have in mind one or more mutually consistent definite
methods of observing the spatial configurations of the systems described
by $\Psi$. Of course we do have such methods for observing the positions
and orientations of large-scale bodies: They depend primarily on
optical coincidences of one kind or another, although the senses of touch
and hearing could be called into play for observations of this kind. Not
to become too discursive, it will be assumed that any of the classical
methods for the measurement of configuration is valid in the domain of
quantum mechanics if due regard is paid to the finite resolving power of
optical instruments and to the diffraction of matter corpuscles in methods
of observation by collision. There is a difficulty in principle about the
observation of unique atomic configurations owing to the fact that they
contain many identical particles (electrons) between which we are unable
to distinguish. For the moment, however, we shall ignore this difficulty
and proceed with the discussion as if all electrons and other like particles
were provided with legible number plates.

Since all values of positional coordinates are eigenvalues of the 
corresponding operators, it is clear that postulate a is automatically 
satisfied by any of the necessarily approximate positional observations 
in the sense that it yields a short range of possible values from which 
any one can be arbitrarily chosen as the result of the measurement.
In connection with postulate b we may assume that any observation
which from a classical point of view gives the position at the moment 
when the observing mechanism begins to interact with the system is 
a retrospective observation, while one which gives the position at the 
moment when the observing system ceases to interact is a predictive 
observation. In general an observation is both retrospective and 
predictive if the time of interaction between the observed system and 
the observing mechanism is small, and the variable measured is conserved 
from a classical point of view during the interaction. As the positional 
coordinates of a free dynamical system are not constant in time, we have 
formulated b as a statement of agreement between a predictive observation 
and a retrospective one performed immediately afterwards. When so





formulated, the validity of b for positional observations within the limit 
of available precision is a matter of common experience. 

Postulate c is the fundamental postulate of wave mechanics. It can 
be tested only in conjunction with such other assumptions as yield an 
initial $\Psi$ appropriate to the state in question. We take its validity for
atomic systems to be a matter of common experience supported by a
considerable variety of experiments in which the spatial distributions of
electrons and atoms are found to agree with wave-mechanical calculations.

Consider next the applicability of a, b, c to the measurement of 
dynamical variables which are not spatial coordinates. There is an 
important class of measurements for which the validity of these postulates 
is a direct consequence of their validity for observations of configuration, 
viz., those measurements which involve an interaction with an observing
mechanism (in limiting cases this step is omitted) followed by a positional
observation carried out either on the observed system, or on a part of the
observing mechanism. In such cases the correctness of the postulates
can be proved or disproved from the hypotheses regarding $|\Psi|^2$ by an
appropriate quantum-mechanical analysis of the assumed experimental
procedure. We have already done this for linear-momentum measurements
in Sec. 15. In another class of measurements the interaction
between radiation and matter plays a fundamental part, and the theory
of the experiment becomes a special case of quantum electrodynamics
which involves postulates not included in the general theory here devel- 
oped. Even for measurements of this type, however, ad hoc hypotheses 
should be unnecessary, the validity of the postulates a, b, and c being 
a consequence of the basic assumptions of the theory. 

These considerations lead us to the formulation of an additional 
postulate : 

e. The statistical distribution of results in the case of any type of experi- 
ment performed upon the members of a pure-state assemblage is calculable 
in principle from the corresponding wave function. 

The experiments under consideration here include many which are 
not statistical measurements of a dynamical variable. Measurements 
of the scattering of electrons by atoms, for example, are not simply 
measurements of the eigenvalues of a dynamical variable but determine
the change in the statistical distribution of momenta of an assemblage 
of free electrons brought about by interaction with atoms. The postulate 
means that a pure-state assemblage has no intrinsic properties not 
implicit in its wave function. 

41d. The Reduction of the Wave Packet. — We have already shown
(p. 324) that in consequence of postulates a, b, and c a statistical predictive
measurement of a dynamical variable $\alpha$ must prepare an assemblage
which is a mixture of subassemblages each of which has a wave function that is an eigenfunction

¹ Cf. Secs. 14b, 15f, 16 (p. 75), and 19d.




of $\alpha$ with a common eigenvalue $\alpha'$. Therefore, in general, there must be
a discontinuous change in the wave function or mixture of wave functions 
associated with a system when one of its dynamical variables is measured. 
As this change has confused some physicists it will be well to analyze it 
carefully. If the observing mechanism, owing to its relatively large 
mass, can be treated on a classical basis, the operation of measurement 
converts each initial wave function into a definite corresponding final 
wave function, i.e., it converts pure states into pure states. The scope
of the following discussion is limited to measurements of this simplest
variety. In Sec. 42 we shall deal briefly with measurements of a more 
general type. 

When a measurement converts pure states into pure states the dis- 
continuous change in the wave function which accompanies a measure- 
ment is called the reduction of the wave packet. The necessity for such 
a change has been noted in the earlier sections of this book (cf. Sec. 15f).
It is clearly unavoidable if we accept the notion that $|\Psi|^2\,d\tau$ is a measure
of the probability of configurations which lie within the element $d\tau$
of configuration space, and if we assume the validity of b for measurements
of position. To put the argument in its most elementary form
consider a single-particle problem with a wave function $\Psi(x,y,z,t)$ spread
out over a large effective volume $V$ at the instant $t_0$. Let a positional
measurement made at $t_0$ locate the particle within a relatively small
volume element $\delta V$. If this measurement is properly carried out it will
leave the particle in the element $\delta V$ at $t_0$, although it will inevitably
alter the momentum. Thus a wave function which is to determine
the probabilities of different positions for future times must approach
zero outside $\delta V$ as $t$ decreases to $t_0$. Otherwise there could not be that
agreement between successive observations of position which creates 
the continuity of our experiences with large-scale objects. 

The assumption of such a discontinuous change in the wave function is 
paradoxical, however, in view of the continuity of $\Psi$ as a function of $t$
demanded by the fundamental Schrödinger equation (7·3). This
paradox becomes acute if one attempts to interpret wave functions as 
in some sense physical realities. It is possible, for example, to determine 
the position of a particle A by an indirect method in which it is allowed 
to collide with a second particle B whose "orbit" before and after the
collision is measured. In this case the wave function of the combined
system A + B is reduced when the scattered particle B is located after
the collision. This reduction locates the particle A at the moment of 
the collision and also determines its momentum (within the limits 
allowed by the Heisenberg uncertainty principle). Thus an observation 
made on B which does not involve any disturbance of A can alter completely
our expectation of the future behavior of the latter and thus completely
change the type of wave function which must be associated with it



328 THE MEASUREMENT OF DYNAMICAL VARIABLES [Chap. IX 


if we are to treat it as an independent system.¹ Clearly this fact should
reinforce our previous conclusion that the wave function is merely a 
subjective computational tool and not in any sense a description of 
objective reality. 

On the other hand the paradox is at once resolved if we adhere strictly 
to the previously formulated interpretation of $\Psi$ as a mathematical
specification of the properties of an assemblage of similarly prepared, 
identical, independent systems. Whenever a statistical predictive 
measurement is carried out we have to do with a selective process in which 



Fig. 18. — The magnetic-deflection method of measuring energies and momenta. 

the systems of the original assemblage are divided into two or more sub- 
assemblages each associated with a value, or range of values, of the 
dynamical variable measured. The discontinuous change in the wave 
function of a system which accompanies the measurement of a variable
$\alpha$ is the reflection of a mental process in which we transfer the system
under consideration from an initial assemblage $A$ to a subassemblage $A_{\alpha'}$,
consisting of those members of $A$ which have given the measured value
$\alpha'$ of $\alpha$, all other members of $A$ having been rejected. There is evidently
no contradiction between the discontinuous change in the wave function
of a system which accompanies this mental transfer and the continuity
required by Eq. (7·2) for the wave function of a fixed assemblage of
systems subject to definite external forces. 

The situation will be clarified by an example. Consider a measurement
of electron energies by the magnetic-deflection method. Let $O$
in Fig. 18 be the source of the stream of electrons measured, let $S_1$ be a
slit and $BB$ the focal plane. Diffraction effects at $S_1$ are small with the
usual energies and slit widths so that a classical theory is adequate to 
tell us that a uniform magnetic field perpendicular to the plane of the 
paper will bring electrons with any given energy to an approximate focus 
in the plane BB provided that BB is perpendicular to the line drawn from 

¹ This general type of observation has been the subject of much recent discussion.
(Cf. Einstein, Podolsky, and Rosen, "Can Quantum-mechanical Description of
Physical Reality Be Considered Complete?" Phys. Rev. 47, 777 (1935); N. Bohr, Phys.
Rev. 48, 696 (1935); E. Schrödinger, Naturwissenschaften 23, 807, 823, 844 (1935),
Proc. Cambridge Phil. Soc. 31, 555 (1935); W. H. Furry, Phys. Rev. 49, 493 (1936).)





$O$ to the mid-point of $S_1$. A photographic plate located on this plane
provides a possible basis for a retrospective statistical measurement of
the electron energies. Of course, it is necessary for the success of the
experiment that the current drawn from $O$ shall be small enough so that
the Coulomb forces between the electrons are negligible. Only in that 
case can we consider the stream to be the equivalent of a Gibbsian assem- 
blage of independent electrons. 

To convert this retrospective measurement into a predictive one a
diaphragm cut by one or more slits $S_2$ can be substituted for the photographic
plate. The electrons which pass through any of the slits then
form a subassemblage with energies within a small range $\Delta E$ determined
by the slit widths and resolving power. This modification of the experimental
arrangement is analogous to the conversion of an optical spectrograph
into a monochromator.

A first step in the quantum-mechanical analysis of the experiment is
to note that in general the assemblage of electrons leaving the source $O$
will not be of the pure-state type but a mixture of such states. We
are at liberty, however, to deal individually with the component pure-state
subassemblages. Let us concentrate our attention on one of these
components, say $\gamma$, whose three-dimensional wave function we designate
as $\Psi_\gamma(x,y,z,t)$. We assume that for our present purpose it is legitimate
to idealize the diaphragms and other barriers to the motion of free
electrons, treating them as infinite potential barriers which reflect all
incident $\Psi$ waves.¹ The interaction of the diaphragm with the electrons
then produces no changes in $\Psi_\gamma$ which are discontinuous in time, but
it does divide it into three parts: a portion $\Psi_{\gamma'}$ which passes through $S_1$
and $S_2$; a portion $\Psi_{\gamma''}$ which passes through $S_1$ but strikes the upper side
of the diaphragm after magnetic deflection; and a portion $\Psi_{\gamma'''}$ which
fails to pass $S_1$. (For simplicity we assume a single slit $S_2$ in the path
of the downward-moving electrons.) If the experimenter desires to
separate the electrons which pass through each slit from those which
are blocked off, and to keep them separate, he will interpose barriers
which actually prevent any overlapping of $\Psi_{\gamma'}$, $\Psi_{\gamma''}$, $\Psi_{\gamma'''}$ once these
three portions of $\Psi_\gamma$ have been separated. From a classical point of
view this procedure would divide the original assemblage $\gamma$ into three
distinct subassemblages of electrons $\gamma'$, $\gamma''$, $\gamma'''$ each located, after the
closing of the slits, in a different vessel. For convenience we designate
by $\Gamma'$, $\Gamma''$, $\Gamma'''$ the three vessels, or portions of space, in which $\Psi_{\gamma'}$, $\Psi_{\gamma''}$, $\Psi_{\gamma'''}$
are laid out. If there is no condensation on the walls, the number of
electrons in each of these vessels would be constant in classical theory
from the moment of closing the slits onward. Actually it is operationally
meaningless to say that the number of electrons in each vessel is constant

¹ A mathematical model with a complex potential energy would provide for absorption
of some electrons at the walls of containing vessels, if desired.





between the time when the slits are closed and the first time, if any,
when the wave functions are disturbed by a counting operation. However,
the probable number of electrons in each vessel is constant in
time during this period according to wave mechanics, being equal, as a
fraction of the whole number in $\gamma$, to the norm $N\Psi_{\gamma'}$
[i.e., $(\Psi_{\gamma'}, \Psi_{\gamma'})$], $N\Psi_{\gamma''}$, or $N\Psi_{\gamma'''}$, as the case may be.

We now desire to show that the electrons in $\Gamma'$ which come from the
primary assemblage $\gamma$ may be regarded as forming a pure-state assemblage
with the normalized wave function $\Psi_{\gamma'}/(N\Psi_{\gamma'})^{1/2}$. (To be sure we do
not know a priori the exact number of electrons in this assemblage as
we should in the case of an ordinary Gibbsian assemblage. We
know only that the probable value of the ratio of the number of
electrons in the volume $\Gamma'$ to the total number in the assemblage $\gamma$
is $N\Psi_{\gamma'}$. This restriction is of no fundamental importance, however,
since in practice we never know a priori the number of elements in one
of the concrete assemblages of approximately independent systems with
which we are forced to deal in testing the predictions of wave mechanics
for atomic problems.) Consider first the result of a retrospective
positional observation made upon the electrons in $\Gamma'$. As these electrons
form part of the pure-state assemblage $\gamma$, the ratio of the probable
number of electrons in any volume element $dV$ of $\Gamma'$ to the whole number
of electrons in $\gamma$ is $|\Psi_\gamma|^2\,dV$, or $|\Psi_{\gamma'}|^2\,dV$. Hence the probable value of the
ratio of the number of electrons in $dV$ to the number in all $\Gamma'$ coming
from $\gamma$ is $|\Psi_{\gamma'}|^2\,dV/N\Psi_{\gamma'}$. This proves that, so far as positional measurements
are concerned, the electrons in $\Gamma'$ which come from $\gamma$ are equivalent
to a pure-state assemblage $\gamma'$ with the normalized wave function
$\Psi_{\gamma'}/(N\Psi_{\gamma'})^{1/2}$. But from the discussion in Sec. 41c, p. 326, we learn
that the possibility of using a given wave function $\Psi$ to predict the result
of an arbitrary statistical measurement is to be regarded as a consequence
of the assumption that $|\Psi|^2$ gives the probability density in configuration
space. Hence it must be possible to predict the result of any type of
statistical measurement applied to the electrons in $\Gamma'$ by treating
these electrons as a pure-state assemblage with the wave function
$\Psi_{\gamma'}/(N\Psi_{\gamma'})^{1/2}$.
One possible objection to the above reasoning lies in the fact that we 
have not allowed for the necessity of verifying in the case of an individual 
electron that after the slits are closed it actually has turned up in $\Gamma'$, or
$\Gamma''$, or $\Gamma'''$, as the case may be. If we were dealing with an ideal Gibbsian
assemblage and carrying through predictive measurements of energy,
one at a time, we should not consider our work finished until such a
verification had been made in each case. This verification might be
made in such fashion as to completely spoil the wave function associated
with the future of the electron. Can it also be made in such fashion as to
leave that function unaffected? We believe the answer to be affirmative,
for it will suffice if we wish to be sure that an individual electron is in $\Gamma'$,




to search thoroughly in $\Gamma''$ and $\Gamma'''$ and to verify that it is in neither of
these vessels. Such a search, however futile in the case of a collar
button, provides a conceptual means for securing the desired information
without disturbing the electron or changing in any way the expectation
regarding its future behavior.

So far we have ignored all the electrons emanating from the source 
except those in the pure-state subassemblage $\gamma$. It is clear, however,
that every pure state $\sigma$ contained within the original mixture will be
split exactly like $\gamma$. Hence the complete assemblage of electrons passing
through the slit $S_2$ will behave like a mixture of subassemblages each
of which has a wave function derived from the corresponding initial $\Psi_\sigma$
by a process of division and renormalization like that described above.¹

The experiment we have analyzed is a very special one, but it will
suffice, perhaps, to show that the assumed discontinuous change in the
wave function associated with a system when one of its dynamical
variables is measured does not violate the rule of continuity implicit
in the Schrödinger equation (7·3).

41e. Classical Orbits and Wave Packets. — An interesting point to be
noted in connection with the reduction of the wave packet is that the
mere interaction of the observing mechanism with the observed system
does not make it necessary to reduce the packet. On the contrary, this
interaction produces a continuous change. If the packet is to be reduced,
the interaction must have produced knowledge in the brain of the
observer. If the observer forgets the result of his observation, or loses
his notebook, the packet is not reduced. We are again led to emphasize
the fact that the wave function of a pure-state assemblage is merely a
mathematical tool for computing from all previous observations what
the relative probabilities are for different results when we make our next
observation.

This interpretation solves the paradox of the fact that, although every 
wave packet grows in volume indefinitely with time, the motion of any 
large-scale body seems to follow a sharply defined classical orbit. The 

¹ In general the wave functions of the different components of the mixed assemblage
in $\Gamma'$ obtained by the process of predictive measurement will be much more
nearly alike than the wave functions of the original mixture. Each of the new
functions has been obtained from its parent by a common process of trimming which
restricts the energy range in each case to $\Delta E$ and makes a similar reduction in the
range of values of the vector momentum. If the measurement is performed in a
manner which yields a good resolving power (i.e., with a weak magnetic field, large
radius of curvature, and slit widths of the order of the wave length $\lambda = h/p$) the
reduced wave functions will be so similar that for most purposes their differences will
be negligible. Under these circumstances the mixture in $\Gamma'$ will behave like a pure-state
assemblage and can be treated as such. In fact the only way in which a pure-state
assemblage can be prepared is by starting with a mixture and making such
observations as are needed to reduce the wave functions of all its elements to a common
form.





experimental fact behind this appearance of continuity is the following. 
If we make an observation of the position and momentum of a macro- 
scopic body moving under the influence of known forces, and compute 
correctly from the data by classical methods the position which the body 
should occupy at some later time t, making due allowance for experi- 
mental error, we always find that the measured position at the time t 
lies within the range permitted by our calculations.¹ Hence we may 
consider that the paradox is removed if we can show that in such cases 
the range of uncertainty at the time t allowed by a wave-mechanical 
calculation is essentially the same as for a classical calculation. 

For simplicity let us consider the case of a particle (i.e., the center 
of gravity of any body) moving in one dimension under no forces. We 
shall suppose that our observations have given us a probable initial 
position $\bar{x}_0$, a probable momentum $\bar{p}$ (this will be constant for a free 
particle), and corresponding values of the root-mean-square deviations 
of the actual values of these quantities from their estimated values. 
We denote these root-mean-square deviations by $\Delta x_0$ and $\Delta p$, respectively. 
From a classical point of view the values of $\bar{x}_0$, $\Delta x_0$, $\bar{p}$, $\Delta p$, together with 
the assumption of the Gaussian law of error, serve to define a probability 
distribution in the two-dimensional classical phase space, and so to fix 
a Gibbsian assemblage of systems having the property that the proba- 
bility of any range of values of $x$ or $p$ for the individual system observed 
is the same as the probability of the same range for the assemblage. 
Thus, even classically, the prediction of the future states of an observed 
system is reduced to statistical form when the experimental errors are 
taken into account. 

For each element of this hypothetical classical assemblage 

\[ x_t = x_0 + \frac{t}{m}\,p. \]

Forming averages of each side of this equation over all elements of the 
assemblage we obtain 

\[ \bar{x}_t = \bar{x}_0 + \frac{t}{m}\,\bar{p}. \tag{41·4} \]

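To obtain the root-mean-square deviation of $x_t$ from $\bar{x}_t$, we express 
$(x_t - \bar{x}_t)^2$ in terms of $x_0$, $\bar{x}_0$, $p$, $\bar{p}$, $t$, and average over the assemblage. 
The resulting expression for $(\Delta x_t)^2$ is 

\[ (\Delta x_t)^2 = \overline{(x_t - \bar{x}_t)^2} = (\Delta x_0)^2 + \frac{2t}{m}\left(\overline{p x_0} - \bar{p}\,\bar{x}_0\right) + \frac{t^2}{m^2}(\Delta p)^2. \tag{41·5} \]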

Thus the classical uncertainty in the position of the particle increases with 
time by an amount which depends on the uncertainty in the momentum. 
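
The classical spreading law can be checked numerically. The following is a minimal sketch, not part of the original text: it builds a finite assemblage of (x_0, p) pairs with arbitrary illustrative parameters (the Gaussian widths, the mass, and the time are assumptions), moves every element freely, and compares the directly computed mean-square deviation with the three-term expression for $(\Delta x_t)^2$. For a finite assemblage with population averages the two agree to rounding error, whatever the correlation between x_0 and p.

```python
import random


def spread_check(pairs, t, m):
    """Return (direct, formula) values of (Delta x_t)^2 for an assemblage of (x0, p) pairs."""
    n = len(pairs)

    def mean(values):
        return sum(values) / n

    x0s = [x for x, _ in pairs]
    ps = [p for _, p in pairs]
    x0_bar, p_bar = mean(x0s), mean(ps)

    var_x0 = mean([(x - x0_bar) ** 2 for x in x0s])          # (Delta x_0)^2
    var_p = mean([(p - p_bar) ** 2 for p in ps])             # (Delta p)^2
    cov = mean([x * p for x, p in pairs]) - x0_bar * p_bar   # mean(p x_0) - p_bar x0_bar

    # Free motion of every element: x_t = x_0 + (t/m) p.
    xts = [x + t * p / m for x, p in pairs]
    xt_bar = mean(xts)
    direct = mean([(x - xt_bar) ** 2 for x in xts])

    formula = var_x0 + (2 * t / m) * cov + (t / m) ** 2 * var_p
    return direct, formula


rng = random.Random(0)  # fixed seed so the illustration is reproducible
ensemble = [(rng.gauss(0.0, 1.0), rng.gauss(5.0, 0.5)) for _ in range(1000)]
direct, formula = spread_check(ensemble, t=3.0, m=2.0)
print(direct, formula)
```

The agreement is an algebraic identity rather than a statistical one, so it does not depend on the sample size or on the Gaussian law of error.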

¹ The continuous motion of the image of a large-scale body on the retina of one's 
eye may be regarded as a limiting case of the above general proposition. 





From the quantum-mechanical point of view our expectation regard- 
ing the probability of different ranges of values of Xt is again to be com- 
puted with the aid of a suitably defined Gibbsian assemblage. The 
only difference is that the individual systems of the assemblage do not 
have definite pairs of values of x and p at any time. In this case the 
assemblage to be dealt with is a mixture of pure-state subassemblages, 
each of which has a definite probability $w_k$ and a definite wave function 
$\psi_k$. The expected position of the particle at the time $t$ is to be obtained 
by computing from each of the constituent wave functions a correspond- 
ing mean value of $x_t$ and then averaging these over the different sub- 
assemblages. Thus, 

\[ \bar{x}_t = \sum_k w_k \int x\,|\psi_k(x,t)|^2\,dx. \tag{41·6} \]

Similarly, 

\[ \overline{(x_t - \bar{x}_t)^2} = \sum_k w_k \int (x - \bar{x}_t)^2\,|\psi_k(x,t)|^2\,dx. \tag{41·7} \]

From Eq. (33·18) it follows that Eq. (41·5) applies in quantum theory as 
well as in classical theory provided that the indicated averages are 
carried out according to the rule of Eqs. (41·6) and (41·7), and provided 
$\overline{p x_0}$ is replaced by its quantum-mechanical equivalent 
$\tfrac{1}{2}\,\overline{(p x_0 + x_0 p)}$ to take 
into account the effects of non-commutation.¹ But whether we use 
classical theory or quantum theory the values of $\Delta x_0$ and $\Delta p$ are to be 
taken from the same experimental data. Finally the value of $\overline{p x_0} - \bar{p}\,\bar{x}_0$ 
to be used in either case must depend on an analysis of the experimental 
procedure actually used in measuring the initial position and momentum. 
If the measurements of position and momentum are entirely independent, 
this quantity should be set equal to zero, but, if the measurement of 
position at $t = 0$ is used in the calculation of momentum, this will not 

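¹ The calculation can be carried out as follows: Let $A_k[x]$ denote the expectation 
value of a quantity $x$ in the state characterized by $\psi_k$. Writing $x = x_0 + (t/m)\,p$ 
for the position at the time $t$, we have 

\[ \bar{x} = \sum_k w_k A_k[x] = \sum_k w_k A_k[x_0] + \frac{t}{m}\sum_k w_k A_k[p] = \bar{x}_0 + \frac{t}{m}\,\bar{p}; \]

\[ \overline{(x - \bar{x})^2} = \sum_k w_k A_k[(x - \bar{x})^2] = \sum_k w_k A_k[x^2] - \bar{x}^2 \]

\[ = \overline{x_0^2} + \frac{t}{m}\,\overline{(p x_0 + x_0 p)} + \frac{t^2}{m^2}\,\overline{p^2} - \bar{x}_0^2 - \frac{2t}{m}\,\bar{p}\,\bar{x}_0 - \frac{t^2}{m^2}\,\bar{p}^2 \]

\[ = \overline{(x_0 - \bar{x}_0)^2} + \frac{2t}{m}\left[\tfrac{1}{2}\,\overline{(p x_0 + x_0 p)} - \bar{p}\,\bar{x}_0\right] + \frac{t^2}{m^2}\,\overline{(p - \bar{p})^2}. \]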




be true. In either case, however, the uncertainty in $x_t$ allowed by 
classical and quantum theories under given experimental conditions 
must be the same. 

Although the above discussion is based on a very simple special case 
it does show that, so long as we regard wave functions as merely tools for 
the calculation from present measurements of the results to be expected 
from future measurements, there is no real contradiction between the 
growth of a wave packet in time and the precision with which large- 
scale bodies are found by experiment to follow classical trajectories. 
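
A rough order-of-magnitude illustration may help here. The sketch below is an added example under stated assumptions, not a calculation from the text: it takes the spreading law with no initial correlation between position and momentum, assumes the smallest momentum spread compatible with the uncertainty principle ($\Delta x_0\,\Delta p = h/4\pi$), and asks how long a free packet takes to double its width. The masses and initial widths are arbitrary illustrative choices.

```python
import math

HBAR = 1.054571817e-34  # h / (2 pi) in joule seconds


def packet_width(dx0, mass, t):
    """RMS position spread at time t for a free packet with Delta p = HBAR / (2 dx0)
    and no initial position-momentum correlation (both are assumptions)."""
    dp = HBAR / (2.0 * dx0)
    return math.sqrt(dx0 ** 2 + (t * dp / mass) ** 2)


def doubling_time(dx0, mass):
    """Time at which the spread reaches 2 * dx0, from dx0^2 + (t dp / m)^2 = 4 dx0^2."""
    return 2.0 * math.sqrt(3.0) * mass * dx0 ** 2 / HBAR


# A 1 mg grain located to within 1 micron, and an electron located to within 1 angstrom.
grain_time = doubling_time(dx0=1e-6, mass=1e-6)
electron_time = doubling_time(dx0=1e-10, mass=9.109e-31)
print(f"grain:    {grain_time:.2e} s")
print(f"electron: {electron_time:.2e} s")
```

Under these assumptions the grain needs a time of the order of 10^16 seconds to double its spread, while the electron needs a time of the order of 10^-16 seconds, which is one way of seeing why the growth of the packet goes unnoticed for large-scale bodies.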

42. MORE ABOUT MEASUREMENTS 

42a. Conjugate Variables and Measurements. — The impossibility of 
making simultaneous exact measurements of the Cartesian coordinates 
and conjugate components of linear momentum has already been dis- 
cussed in Secs. 16 and 33b. An exact simultaneous retrospective or 
predictive measurement of two dynamical variables will be possible 
only if they are "mutually compatible" or have a complete system of 
simultaneous eigenfunctions. As we saw in Sec. 37c, the mutual com- 
patibility of two variables implies their commutability, so that we have 
the general rule that only commuting operators are simultaneously 
measurable. The rule regarding conjugate variables is a special case 
of this more general principle. 

One can show that the predictive measurement of any dynamical 
variable $\alpha$ has a tendency to destroy preexisting knowledge regarding 
other variables which do not commute with $\alpha$. This destruction becomes 
complete in the case of an exact measurement of one member of a pair 
of canonically conjugate variables. These are pairs of variables either 
one of which may be regarded as a coordinate, while the other, or its 
negative, plays the role of corresponding momentum. The exact 
definition is given in Sec. 39a. Here it suffices to note that in consequence 
of the definition, if $q_k$ is one of the coordinates of a legitimate coordinate 
system $q_1, q_2, \ldots$, and if $p_k$ is its conjugate momentum, the eigen- 
functions of $p_k$ in $q'$ space have the form 

\[ (q' \,|\, p_k' \cdots) = e^{2\pi i\, p_k' q_k'/h}\, u(q'), \]

where $u(q')$ is independent of $q_k'$. Hence the probability function 
$Q(q_k')$ of Eq. (36·78) becomes 

\[ Q(q_k') = \int |(q' \,|\, p_k' \cdots)|^2 \prod_{j \neq k} dq_j' = \int |u(q')|^2 \prod_{j \neq k} dq_j' \]

in the case of such an eigenfunction of $p_k$. As this is independent of $q_k'$ 
it follows that if $p_k$ is exactly known, all eigenvalues of $q_k$ become equally 
probable. 
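
The statement that an exactly known momentum leaves the conjugate coordinate entirely undetermined can be illustrated in a few lines. This is a minimal sketch with assumed values: units are chosen so that h = 1, the factor u is set to a constant, and the momentum eigenvalue and sample points are arbitrary.

```python
import cmath
import math

H_PLANCK = 1.0  # assumption: units in which h = 1


def momentum_eigenfunction(q, p, u=1.0):
    """exp(2 pi i p q / h) * u, with u independent of the coordinate q."""
    return cmath.exp(2j * math.pi * p * q / H_PLANCK) * u


sample_points = (-2.0, -0.5, 0.0, 1.3, 10.0)
probabilities = [abs(momentum_eigenfunction(q, p=3.7)) ** 2 for q in sample_points]
print(probabilities)
```

Every entry equals the squared modulus of u, because the exponential factor has unit modulus at every value of the coordinate.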





42b. Impossibility of Measurements Which Imply Distinction 
between Particles of Same Species. — The hypothesis is sometimes made 
that to every classical dynamical variable there corresponds a uniquely 
defined quantum dynamical variable. This hypothesis we hold incorrect 
inasmuch as there are classical variables, such as the radial momentum 
in a three-dimensional problem, which definitely have no analogues in 
the class of true quantum dynamical variables.¹ 

One would also like to believe that whenever we have found a class 
of operators which conforms to the mathematical requirements laid down 
for a dynamical variable in quantum mechanics, there exists at least a 
conceptual experimental method for measuring the variable exactly. 
Here again, however, the attractive simple hypothesis breaks down. 
There is at least one large class of exceptions, and there may be others 
outside this class. 

We suppose matter to be composed of a number of species of ele- 
mentary particles, the members of each species being exactly alike. 
As we noted in Sec. 40d it is essentially impossible to distinguish experi- 
mentally between two particles of the same species, be they electrons, 
protons, neutrons, or what not. Therefore the best we can actually do, 
even in principle, in the way of determining the configuration of an atomic 
system consisting of, say, a nucleus, which we treat as a single heavy 
particle, and $f$ electrons, is to determine the coordinates of the nucleus 
and $f$ sets of coordinates for the $f$ electrons.² It is not possible to dis- 
tinguish between the $f!$ different configurations which go with the $f!$ 
different correlations of the $f$ electrons with the $f$ observed positions. 
Hence such an observation gives us a function of the configuration 
which is the same for all configurations derived from any one by a per- 
mutation of the coordinates; in other words, a symmetric function of 
these coordinates. 

Our inability to measure dynamical variables which are not symmetric 
functions of the coordinates and momenta of like particles implies an 
inability to prepare pure subjective states whose wave functions are 
eigenfunctions of nonsymmetric variables. In fact the preparation of a 
system in a state whose wave function is not in some sense symmetric 
with respect to the coordinates of like particles must be excluded as 

¹ In Sec. it was proved that the Hermitian operator which we should expect to 
play the part of radial momentum does not have a complete set of eigenfunctions and 
is not a true dynamical variable in quantum mechanics. From the experimental 
point of view we find ourselves in equal difficulty. Since the radial momentum is not 
constant, classically, even for a free particle, unless it happens to have zero angular 
momentum, we see that the scheme for the exact measurement of $p_x$ used in 
Sec. 16 is not applicable to the radial momentum. Hence we can be sure that our 
mathematical difficulties with the radial-momentum operator are not due to an 
improper choice of that operator. 

² This might be done with a spray of high-intensity hard $\gamma$ rays. 




implying an ability to distinguish between such particles. Hence we 
may properly add a corresponding restriction to the manifold of physically 
admissible wave functions hitherto identified with class D. Evidently 
this restriction can be put in the following form: All the physical predic- 
tions which flow out of any physically admissible wave function must be 
unchanged by the application to that function of an arbitrary permutation 
of the coordinates of like particles. Otherwise we could make tests to 
check which arrangement of the coordinates is right and so acquire 
information inconsistent with the identity of the particles concerned. 

Of course the above rule can hold only if all the coordinates of each 
particle appear as arguments in the wave function, and if all are subject 
to permutation. Actually it is necessary (cf. Sec. 61a) to add to the 
three positional coordinates $x_k$, $y_k$, $z_k$ of each electron and proton, a 
fourth "spin" coordinate $\sigma_k$, which we have not hitherto considered. 
Hence the wave functions dealt with up to this point are "incomplete" in 
that the spin coordinates are suppressed, and the rule we have formulated 
is not applicable to them. In the remainder of Sec. 42b the $\psi$ functions 
discussed are assumed to be of the complete type to which the permutation 
restriction does apply. 

Recollecting that the absolute phase of a $\psi$ function is without physical 
significance, we see that it is not necessary to assume that the result of 
applying any permutation of like particles $P_\tau$ to a legitimate $\psi$ function is 
to leave it unchanged. On the contrary it suffices to assume that 

\[ P_\tau \psi = e^{i\eta_\tau}\psi, \qquad \eta_\tau = \text{real constant}. \tag{42·1} \]

It is not sufficient to require that instantaneously $|P_\tau\psi|^2 = |\psi|^2$, thus 
allowing $\eta_\tau$ to be a function of the positional coordinates, for $\psi$ determines 
probability amplitudes in momentum space and other legitimate, 
generalized, coordinate spaces whose absolute values must also be 
unaffected by $P_\tau$. Hence it is a priori reasonable to restrict the quantity 
$\eta_\tau$ in the above equation to constant values. In fact Witmer and Vinti¹ 
have shown that, unless it is constant, one can derive from a legitimate 
$\psi$ function others which are not legitimate by the mere process of per- 
mutation and linear combination. We therefore assume that Eq. 
(42·1) is necessary as well as sufficient. 

The condition (42·1) implies that $\psi$ is a simultaneous eigenfunction of 
every permutation operator involving only like particles. In the case 
of the elementary permutation operators $P_{ij}$ we know that the only 
eigenvalues are ±1 (cf. Sec. 40d). Moreover, it is not possible for 
the eigenvalue $e^{i\eta_\tau}$ 
to have the value +1 for one interchange and −1 for another involving 
the same kind of particles. To prove this let us assume that $P_{12}\psi = \psi$; 

¹ E. E. Witmer and J. P. Vinti, Phys. Rev. 47, 638 (1935). 





$P_{13}\psi = -\psi$. By inspection we see that $P_{23}$ is equivalent to both of the 
products $P_{13}P_{12}P_{13}$ and $P_{12}P_{13}P_{12}$. According to the first formula, $\psi$ 
would be symmetric with respect to $P_{23}$, while the second formula would 
make it antisymmetric. Hence the assumptions lead to a contradiction. 
We conclude that every complete physically admissible wave function 
must be either symmetric ($e^{i\eta_\tau} = +1$), or antisymmetric ($e^{i\eta_\tau} = -1$), with 
respect to all interchange permutations $P_{ij}$ involving a single type of ele- 
mentary particle. In Sec. 63b we shall see that the experimental facts 
require the choice of wave functions which are antisymmetric with respect 
to interchanges of electrons and of protons.¹ This limitation on complete 
physically admissible $\psi$ functions is the wave-mechanical form of the Pauli 
exclusion principle. 
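
The two factorizations of the interchange of particles 2 and 3 used in this argument can be verified mechanically. The sketch below is an added illustration: permutations of three objects are stored as tuples of zero-based indices (a representation chosen here for convenience), and the assumed eigenvalues +1 and −1 are multiplied along each factorization to exhibit the contradiction.

```python
def transposition(i, j, n=3):
    """Interchange of positions i and j (zero-based) among n objects."""
    perm = list(range(n))
    perm[i], perm[j] = perm[j], perm[i]
    return tuple(perm)


def compose(*perms):
    """Product of permutations; the rightmost factor acts first."""
    n = len(perms[0])
    result = tuple(range(n))
    for perm in reversed(perms):
        result = tuple(perm[result[k]] for k in range(n))
    return result


P12 = transposition(0, 1)
P13 = transposition(0, 2)
P23 = transposition(1, 2)

first = compose(P13, P12, P13)
second = compose(P12, P13, P12)

# Assumed eigenvalues of psi: +1 under P12, -1 under P13.
eigenvalue = {P12: +1, P13: -1}
sign_first = eigenvalue[P13] * eigenvalue[P12] * eigenvalue[P13]
sign_second = eigenvalue[P12] * eigenvalue[P13] * eigenvalue[P12]
print(first == P23, second == P23, sign_first, sign_second)
```

Both products reproduce the same interchange, yet the assumed eigenvalues give +1 along one factorization and −1 along the other, which is the contradiction stated in the text.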

If it were not for this exclusion principle, the proof given in Sec. 15 
that $\Psi(x,t)$ is in principle an experimentally measurable property of a 
pure-state assemblage would fail, owing to our inability to distinguish 
between configurations which differ from one another only by a per- 
mutation of like particles. In view of the exclusion principle, however, 
we know that $|\Psi|^2$ has the same value at all such points of configuration 
space and can determine its value at any individual point from the 
measured value of the sum over the set of points obtained from the 
original one by permutations. Hence $\partial|\Psi|^2/\partial t$ and ultimately $\Psi$ 
itself are experimentally determinable. 

Let us now return to the question of symmetric and nonsymmetric 
dynamical variables. Consider an operator $\alpha$ which is given as an explicit 
function of the coordinates $x_i$ and momenta $p_i$ of the individual particles: 

\[ \alpha = f(x_1, p_1;\, x_2, p_2;\, \cdots). \]

Let the function $f$ have such a form that 

\[ P_\tau f(\xi_1, \eta_1;\, \xi_2, \eta_2;\, \cdots) = f(\xi_1, \eta_1;\, \xi_2, \eta_2;\, \cdots), \]

where $\xi_i$, $\eta_i$ are numbers substituted for the operators $x_i$, $p_i$. In this 
case we say that $\alpha$ is symmetrical. Every operator which satisfies this 
requirement must commute with all such permutations $P_\tau$. It follows 
that, if $\psi_n$ is an eigenfunction of $\alpha$ with the eigenvalue $a_n$, $P_\tau\psi_n$ is another 
eigenfunction with the same eigenvalue. 

Every $P_\tau$ can be regarded as the product of a number of interchange 
operators $P_{ij}$. The resolution into these elementary factors is not unique, 
but the number of factors is always even, or always odd. We accordingly 
define a permutation as even or odd, according as it is composed of an 
even or odd number of interchanges. Let the index $p_\tau$ be defined as 
zero, if $P_\tau$ is even, and, as unity, if $P_\tau$ is odd. Denoting the number of 

¹ Cf. W. Heisenberg, Zeits. f. Physik 38, 411 (1926). 





like particles (electrons or protons) as before by $f$, we consider the 
operator¹ 

\[ g = \frac{1}{f!}\sum_\tau (-1)^{p_\tau} P_\tau. \tag{42·2} \]

The sum is to be extended over the $f!$ distinct permutations of the $f$ 
particles. 

If we multiply any two permutations together we get a third per- 
mutation — this is the fundamental group property. Moreover, if 
$P_\lambda = P_k P_\tau$, $P_{\lambda'} = P_k P_{\tau'}$, and $P_\tau \neq P_{\tau'}$, it follows that $P_\lambda \neq P_{\lambda'}$. Simi- 
larly $P_\tau P_k \neq P_{\tau'} P_k$. Therefore, if we multiply each of the $f!$ permuta- 
tions in turn by any one, either as a prefactor or a postfactor, the products 
comprise all $f!$ permutations. If $P_k$ is an even permutation, $P_k P_\tau$ is 
even or odd according as $P_\tau$ is even or odd. If $P_k$ is odd, $P_k P_\tau$ is even 
when $P_\tau$ is odd and vice versa. If we multiply $g$ by any permutation $P_k$, 
each term is converted into one of the other terms with its sign changed 
if $P_k$ is odd, but with its sign unchanged if $P_k$ is even. Hence 

\[ P_k\, g = g\, P_k = (-1)^{p_k}\, g. \]

It follows at once that $g\psi$ either vanishes or is an eigenfunction of every 
permutation operator $P_k$ with the eigenvalue +1 if $P_k$ is even and the 
eigenvalue −1 if $P_k$ is odd. If it does not vanish, it is antisymmetric 
with respect to every one of the interchange permutations $P_{ij}$. 

If the function $\phi$ is antisymmetric with respect to all the interchanges 
$P_{ij}$, it is unchanged by the application of any even permutation and 
simply reversed in sign by any odd permutation. Consequently 

\[ g\phi = \frac{1}{f!}\sum_{\tau=1}^{f!} (-1)^{p_\tau} P_\tau \phi = \phi. \]

Thus every antisymmetric function is an eigenfunction of $g$ with the 
eigenvalue +1. It follows that 

\[ g(g\psi) = g\psi \tag{42·4} \]

and that 

\[ g(\psi - g\psi) = g\psi - g\psi = 0. \tag{42·5} \]

Hence every function of the coordinates of $f$ particles can be resolved into 
the sum of two terms, viz., $g\psi$ and $(\psi - g\psi)$, one of which is antisymmetric 
and an eigenfunction of $g$ with the eigenvalue +1, while the other is an 
eigenfunction of $g$ with the eigenvalue 0. 
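
The properties of the operator g stated above lend themselves to an exact check on a small discrete model. The following is a sketch under assumptions of our own choosing: each of three like particles has a single coordinate taking the values 0, 1, 2, a "wave function" is a table of exact rational values on the 27 configurations, and the sample function is arbitrary. The parity index is computed by counting inversions, which is one standard way of deciding whether a permutation is even or odd.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def parity(perm):
    """Index p: 0 for an even permutation, 1 for an odd one (count of inversions mod 2)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return inversions % 2


def permute(psi, perm):
    """(P psi)(x_0, ..., x_{f-1}) = psi(x_perm[0], ..., x_perm[f-1])."""
    return {xs: psi[tuple(xs[k] for k in perm)] for xs in psi}


def antisymmetrize(psi, f):
    """g psi = (1 / f!) * sum over permutations of (-1)^p * P psi."""
    total = {xs: Fraction(0) for xs in psi}
    for perm in permutations(range(f)):
        sign = -1 if parity(perm) else 1
        permuted = permute(psi, perm)
        for xs in total:
            total[xs] += sign * permuted[xs]
    return {xs: value / factorial(f) for xs, value in total.items()}


f = 3
points = list(product(range(3), repeat=f))
# Arbitrary illustrative function; the constant 1 is symmetric and is removed by g.
psi = {xs: Fraction(xs[1] * xs[2] ** 2 + 1) for xs in points}

g_psi = antisymmetrize(psi, f)
residue = {xs: psi[xs] - g_psi[xs] for xs in points}
print(g_psi[(0, 1, 2)], antisymmetrize(g_psi, f) == g_psi)
```

With exact fractions the relations g(gψ) = gψ and g(ψ − gψ) = 0 hold identically rather than to rounding error, and applying any single interchange to gψ reverses its sign at every configuration.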

There is no linear combination of permutations of $\psi$ which is anti- 
symmetric and linearly independent of $g\psi$. To prove this we assume 

¹ Except for the normalizing factor, $g$ is the same as the "antisymmetrizer" of 
Condon and Shortley, T.A.S., Chap. VI, Sec. 3. 





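that the function $\phi = \sum_\tau c_\tau P_\tau \psi$ is antisymmetric. Then $g\phi = \phi$. 
Hence 

\[ \phi = g\phi = \sum_\tau c_\tau\, g P_\tau \psi = \Big[\sum_\tau (-1)^{p_\tau} c_\tau\Big]\, g\psi. \]

Thus $\phi$ is a multiple of $g\psi$. We conclude that $g\psi$ and its multiples are 
the only antisymmetric functions that can be formed from the linear manifold 
of the permutations of $\psi$. 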

If $\alpha$ is any dynamical variable which commutes with all the per- 
mutation operators, it commutes with $g$. Consequently there exists a 
complete set of simultaneous eigenfunctions of $\alpha$ and $g$. Every anti- 
symmetric function $\phi$, being an eigenfunction of $g$, can be expanded in 
terms of antisymmetric eigenfunctions of $\alpha$. Thus the experimental 
resolution of an arbitrary, physically admissible, pure-state assemblage 
into subassemblages whose wave functions are eigenfunctions of $\alpha$ 
involves no violation of the exclusion principle. We conclude 
that every dynamical variable which commutes with every $P_\tau$ can be reckoned 
as symmetric in the coordinates of like particles and as measurable in 
principle. On the other hand, it is evident that dynamical variables which 
do not commute with every permutation $P_\tau$ cannot commute with $g$ and 
cannot be measured without violating the Pauli principle. The symmetric 
dynamical variables thus defined form the most restricted class of 
operators concerning which we can say that there is no a priori reason 
why they should not be measurable. Hence we shall refer to them as 
observables, using the latter term in a narrower sense than that of Dirac. 

If the observables $\alpha_1, \alpha_2, \alpha_3, \cdots$ form a normal commuting set of type 1 operators,¹ 
it will be possible to expand any physically admissible wave function into simultaneous 
eigenfunctions of the set, each of which satisfies the Pauli principle. Such a set of 
type 1 observables will be called complete when any two antisymmetric simultaneous 
eigenfunctions with the same eigenvalues are linearly dependent (cf. Sec. 37d, p. 287). 
In place of the requirement for type 1 operators introduced on pp. 249 and 254 
(Sec. 36), we now postulate that every type 1 observable can be united with one or more 
additional observables of the same variety to form a complete normal commuting set. 
The formulation of a corresponding postulate for observables which are not of type 1 
can be left to the reader. 

As $g$ commutes with every allowable Hamiltonian function $H$, it 
commutes with the time-displacement operator $e^{-2\pi i t H/h}$ (cf. Sec. 32b, 
p. 201). If $\Psi(x, t_0)$ is antisymmetric, it follows that $\Psi(x, t_0 + t)$, i.e., 
$e^{-2\pi i t H/h}\,\Psi(x, t_0)$, is also antisymmetric. Consequently a wave function 
which satisfies the Pauli principle at any instant $t_0$ must satisfy it at 
every other instant. 

Some remarks concerning the observation of the configurations of 
macroscopic bodies are apropos at this point. When one notes the posi- 

¹ From the Dirac point of view the restriction to operators of type 1 is unnecessary. 





tion of a chair he is clearly observing a function of the coordinates of the 
particles that compose it, which is symmetrical with respect to all per- 
mutations of like particles. One takes the continuity in time of the 
image of a chair on his retina as proof that the final image and the initial 
image refer to the same object. The existence of moving pictures shows, 
however, that the apparent continuity of a succession of images is not 
proof of their correlation with a single object. If there were many 
identical chairs in a room, an instantaneous permutation of the coordi- 
nates of these chairs would be unobservable. And, even granting the 
uniqueness of any particular macroscopic object, such as a chair, we 
can have no evidence that the elementary particles which compose it at 
the time t' are the same as at some other time t". Thus the concept of 
identity rests on continuity of experience and ceases to have meaning 
when experience is essentially discontinuous. In our experiments 
with atomic systems, however, discontinuity of experience becomes the 
rule rather than the exception. In fact, it can be shown that continuity 
of experience is impossible with reference to observations made on the 
electrons in the low energy states of atoms and that the concept of the 
identity of the electrons becomes operationally meaningless for this 
class of experiments. 

In order to obtain an approximately continuous set of atomic-configuration 
observations it would be necessary to make positional observations without disturbing 
the momenta by amounts which are appreciable in comparison with their observed 
values. A measurement of configuration to be used for the identification of the 
electrons must involve an uncertainty $\Delta q$ in the position of each one which is small 
compared with the average distance between electrons and much smaller than the 
effective atomic radius, which we denote by $a$. 

Let $T$ denote the mean kinetic energy of a helium atom. Let $p_1$ and $p_2$ be the 
momenta of the electrons and let $\mu$ be their mass. Then 

\[ T = \frac{\overline{p_1^2}}{2\mu} + \frac{\overline{p_2^2}}{2\mu}. \]

It follows from the virial theorem (cf. p. 506) that the average kinetic energy is the 
negative of the total energy $E$. Hence the mean square of each component of linear 
momentum is $-\mu E/3$. Let $\Delta p$ be the root mean square of the momentum 
perturbation due to the measurement of the conjugate positional coordinate. It 
should be small in comparison with $(-\mu E/3)^{1/2}$ if successive observations are to reveal 
a continuous orbit. The potential energy is 

\[ V = -2e^2\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{r_{12}}. \]

Its mean value must be of the order of $-3e^2/a$. By the virial theorem, $\bar{V} = 2E$. Thus the 
observation of a continuous orbit implies that 

\[ \Delta p\,\Delta q \ll \left(-\frac{\mu E}{3}\right)^{1/2}\left(-\frac{3e^2}{2E}\right), \]

whereas the Heisenberg uncertainty principle requires that $\Delta p\,\Delta q \geq h/4\pi$. Putting 





these two requirements together we see that an experimental continuous orbit can 
exist only if $-E \ll 6Rh$, where $R$ is the Rydberg constant. The actual value of 
$-E$, however, is of the order of $6Rh$. 
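
The arithmetic behind this comparison can be retraced as follows. The sketch is an added illustration, not the author's computation: it uses Gaussian (c.g.s.) values of the constants, takes $Rh = 2\pi^2\mu e^4/h^2$ for the Rydberg energy, and assumes about 79.0 electron volts for the total binding energy of the helium ground state (the sum of the two ionization energies). The function name and the variable names are ours.

```python
import math

E_CHARGE = 4.80320e-10     # electron charge, esu
M_ELECTRON = 9.10938e-28   # electron mass, g
H_PLANCK = 6.62607e-27     # Planck constant, erg s
ERG_PER_EV = 1.602177e-12

# Rydberg energy R h = 2 pi^2 mu e^4 / h^2.
rydberg_energy = 2 * math.pi ** 2 * M_ELECTRON * E_CHARGE ** 4 / H_PLANCK ** 2


def orbit_product_bound(energy):
    """Upper limit (-mu E / 3)^(1/2) * (3 e^2 / (-2 E)) on Delta p * Delta q, for E < 0."""
    return math.sqrt(-M_ELECTRON * energy / 3) * (3 * E_CHARGE ** 2 / (-2 * energy))


heisenberg_limit = H_PLANCK / (4 * math.pi)

# Energy at which the upper limit just equals h / (4 pi): -E = 12 pi^2 mu e^4 / h^2.
critical_energy = 12 * math.pi ** 2 * M_ELECTRON * E_CHARGE ** 4 / H_PLANCK ** 2

helium_energy = -79.0 * ERG_PER_EV  # assumed ground-state total energy of helium
ratio = orbit_product_bound(helium_energy) / heisenberg_limit

print(critical_energy / rydberg_energy)    # critical energy in units of R h
print(-helium_energy / rydberg_energy)     # actual binding energy in units of R h
print(ratio)                               # upper limit divided by h / (4 pi)
```

On these figures the upper limit on the product of the uncertainties exceeds h/4π by only a few per cent for helium, so the two requirements cannot both be met with any margin.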

42c. A Classification of Observations. — It is normally possible to 
distinguish between a primary and a secondary observing mechanism. 
The function of the latter is to detect the effect of the interaction between 
the observed system and the primary mechanism. If the dynamical 
variable under observation is conserved during the period of interaction 
with the primary observing mechanism, this period can be as long as 
desired. If it is not so conserved, the time of the primary interaction 
must ordinarily be brief. This is true, for example, in observations of 
position and momentum by collision. In measurements of momentum 
by the basic method of Chap. II, Sec. 15, the primary observing mecha- 
nism must neutralize any force field acting on the particle under observa- 
tion, and the momentum is thus conserved during the long time which 
must elapse between the measurements of position. 

Observations can be divided into two classes according as the result 
is made to hinge on the effect of the primary interaction on the system 
observed, or on the primary observing mechanism. A spectroscopic 
observation, in which the primary observing mechanism consists of a 
slit, a pair of lenses, and a prism, belongs to the first class. We are 
interested here in the effect of the primary interaction on the photons 
to be measured, and not, for example, in the momentum which the 
prism receives from the light which passes through it. Similarly, in 
the case of the Stern-Gerlach experiment, where we have to do with a 
stream of atoms interacting with a magnet producing an inhomogeneous 
field, it is the effect of the interaction on the atoms observed which is 
important. On the other hand, when we measure the position of an 
object with an optical device, or when we measure velocity in the line 
of sight by the Doppler effect, we concentrate our attention on the 
effect of the interaction on the radiation which constitutes the primary 
observing mechanism. It is evident that whenever the observed system 
and the primary observing mechanism have very unequal masses the 
result of the observation must hinge on the behavior of the lighter of the 
two. 

In case the masses of the system observed and the primary observing 
mechanism are not very different a complete quantum-mechanical 
theory of the experiment must treat them as different parts of a single 
system during the period of interaction, although they must be well 
separated in the beginning and at the end. If the two masses are quite 
different, however, we can usually describe the interaction by means of a 
perturbing term $H_1$ in the Hamiltonian of the lighter of the pair of interacting 
systems. This amounts to dealing with the heavier system on a classical 
basis. 





A particularly important variety of observation for quantum mechan- 
ics is that in which the primary observing mechanism is very massive 
in comparison with the system observed and in which the observable 
to be measured is conserved during the period of the primary interaction. 
We shall designate this variety of observation as of type A. Measure- 
ments of type A can be repeated because the property to be measured is 
not destroyed by the interaction. 

Let the Hamiltonian for the period of interaction in a type A observa- 
tion on $\alpha$ be $H = H_0 + H_1$. Since $\alpha$ is conserved, it follows that 

\[ H\alpha - \alpha H = 0. \]

Let $\Psi$ denote the wave function for the assemblage of systems under 
observation, or for an hypothetical assemblage prepared like the system 
to be measured, if there is only one. This function experiences a trans- 
formation during the period of primary interaction which we can 
symbolize by means of the time-displacement operator $T$. Thus 

\[ T\,\Psi_{t_0} = \Psi_{t_0 + \Delta t}, \]

where the unitary operator $T$ has the explicit form 

\[ T = e^{-2\pi i\,\Delta t\,H/h} = \sum_{n=0}^{\infty} \frac{1}{n!}\left(-\frac{2\pi i\,\Delta t}{h}\right)^{n} H^{n} \]

(cf. Sec. 32b, p. 201 and Sec. 42b, p. 339) and is thus a function of the 
perturbed Hamiltonian $H$. It follows that $T$ like $H$ must commute with 
$\alpha$. 
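
That an operator built as a power series in H inherits the commutation of H with a conserved variable can be seen in a two-by-two toy model. This is a sketch under assumed values: units are chosen so that h/2π = 1, the matrices standing for H and for the conserved variable are arbitrary illustrative choices that commute, and the series is truncated after a fixed number of terms.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def time_displacement(hamiltonian, dt, terms=40):
    """Truncated series T = sum_n (-i dt H)^n / n!, in units where h / (2 pi) = 1."""
    n = len(hamiltonian)
    term = [[complex(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = matmul(term, hamiltonian)
        term = [[(-1j * dt / k) * value for value in row] for row in term]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


def max_abs(matrix):
    return max(abs(value) for row in matrix for value in row)


H = [[1, 2], [2, 1]]       # illustrative Hamiltonian
alpha = [[0, 1], [1, 0]]   # illustrative conserved variable; H = 1 + 2 * alpha
T = time_displacement(H, dt=0.3)

print(max_abs(commutator(H, alpha)), max_abs(commutator(T, alpha)))
```

Each term of the series is a power of H and therefore commutes with the conserved variable, so the sum does as well; the printed commutator of T vanishes to rounding error.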

The effect of the primary interaction, or of the equivalent operator $T$, 
is usually to separate the subassemblages in space, so that a subsequent 
positional observation by the secondary observing mechanism is sufficient 
to determine the value of $\alpha$ for any individual system. 

The magnetic-deflection method of measuring energy discussed in 
Sec. 41d is an example of a type A measurement in which the magnet, 
which produces the field, and the diaphragm are the essential parts of 
the primary observing mechanism. The measurement of e/m with a 
mass spectrograph, the Stern-Gerlach experiment for the measurement 
of atomic magnetic moments, and spectrum analyses by prism and 
grating are other examples of the same nature. In all these cases a 
massive primary mechanism interacts with a stream of atomic systems 
(including photons), in such a way as to spread the latter out into a 
spectrum where a subsequent positional observation completes the 
experiment. The measurement of the momentum of a free particle by 
the method of Sec. 16 constitutes a variant of this type of observation 
in which no primary observing mechanism is needed. 

42d. Measurements as Correlations. — In the above mentioned 
experiments the variable to be measured is constant for the unperturbed 





system so that the time at which the observation begins need not be 
specified. On the other hand, if we have to do with a quantity $\alpha$ which 
is not an integral of the natural motion, e.g., the x coordinate of a particle 
whose average velocity-component along the x axis is not zero, we must 
usually specify the time at which the measurement is made. In other 
words the observation normally consists in a correlation of two different 
quantities, say a value of $\alpha$ and a time. Sometimes, however, the 
measurement can involve a correlation of two different variables, say x 
and y, neither of which is the time. 

As an example consider the process of passing an electron or a stream 
of electrons through a slit or other aperture in a diaphragm. This is a 
measurement which approximately defines a point in the orbit of each 
particle passing through the aperture. The interaction between the 
systems under observation and the primary observing mechanism 
divides the initial assemblage into two subassemblages with mutually 
orthogonal wave functions and consisting respectively of those systems 
which pass through the aperture and those which strike the diaphragm. 
We can regard the observation as one which fixes the value of an observ- 
able $\zeta$ whose eigenfunctions are of two types, viz., those which vanish 
with their gradients over the aperture and those which vanish with their 
gradients over all other parts of the plane of the aperture. These eigen- 
functions can be associated with arbitrary corresponding eigenvalues if 
desired, but the eigenvalues are of no importance in connection with such 
"yes" or "no" observations as this. The experiment can be classified as 
a type A observation since the interaction of the particles under obser- 
vation and the diaphragm can be represented by a perturbing term in 
the Hamiltonian of the former, and since the variable measured is in a 
sense conserved (so that a second diaphragm like the first and placed 
beside it would not affect the result). It is hardly necessary to remark 
that as the aperture of the diaphragm shrinks in size, the observation of 
$\zeta$ becomes an arbitrarily exact measurement of the positional coordinates 
for the subassemblage of atomic systems which pass from one side of 
the diaphragm to the other. 

42e. The Observing Mechanism Not Entirely Classical. — Consider 
next the more general type of observation in which the mass of the 
primary observing mechanism B is not large compared with that of the 
observed system A, and in which both A and B must be treated on a 
quantum-mechanical basis. Observations of the position, energy, and 
momentum of atomic systems by electron or photon impact are examples.
In measurements of this type the systems A and B are allowed to interact 
and a subsequent measurement of B is interpreted with the aid of one 
of the conservation laws as a measurement of A. 

In such cases the classical point of view demands the existence of a
definite history of events which can be divided into three periods, viz.,
the time before interaction, the time after the interaction, and the time
during which the interaction takes place. A similar historical sequence
obtains in quantum theory provided that the times in which the systems 
A and B are prepared and measured do not overlap each other or the
time of interaction. In practice, experiments of this kind are usually
made in such a manner as to spoil this historical sequence, although an
equivalent spatial sequence takes its place. Thus, in measuring atomic
energies by electron impact we allow a steady stream of electrons of
known energy emerging from a slit to pass through a chamber containing
low-density gas and observe the absorption of energy by reading microammeters
which give the currents to various electrodes in the observation
chamber. We have to do with a state of kinetic equilibrium in which
the preparation of the incident stream of electrons B and the observation
of their energies after impact are both continuous processes overlapping
in time. We distinguish between the electrons which have not
yet interacted with the atoms, and those which have interacted, by their
positions rather than by the clock. This fact is clearly unessential,
however, and, since the experiment could be carried out with shutters
so as to establish a time sequence, we simplify the discussion by assuming
that it is actually done in that manner.

Let us further assume for the present that the preparation of the 
system A to be observed and of the system B which constitutes the
primary observing mechanism is such as to leave each of them in a
pure subjective state before the interaction. We then have definite
initial wave functions $\phi(\xi,t)$, $\chi(\eta,t)$ for the separate systems prior to
interaction. Here $\xi$ denotes the totality of the coordinates of A,
and $\eta$ the totality of the coordinates of B. It follows that during this
period of time the combined system is in a pure state with the wave
function $\Psi = \phi(\xi,t)\,\chi(\eta,t)$. No other assumption is consistent with
our knowledge of the independent systems.

The Hamiltonian function of the united system will be of the form
$H = H_A + H_B + H'$, where the last term denotes the mutual energy
which gives rise to the interaction. The complete wave function $\Psi$
is transformed in time according to the rule
$H\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t}$. So long as
the systems are known to be far apart, $H'$ plays a negligible role and the
assumption $\Psi = \phi\chi$ is permissible. During the period of interaction,
however, $H'$ spoils the factorization; i.e., $\phi\chi$ ceases to be a solution of
$H\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t}$. After the interaction we can only say that $\Psi$ is
expressible as a sum of products of the form

$$\Psi = \sum_{n,m} c_{nm}\,\phi_n(\xi,t)\,\chi_m(\eta,t), \tag{42-6}$$

where the $\phi_n$'s and $\chi_m$'s are orthonormal sets of solutions of the Schrödinger
equations of A and B, respectively, and the $c_{nm}$'s are constant.¹

It follows at once that, although the united system is left in a pure state,
the states of A and B taken separately are mixed. To establish this fact
we compute the mean value of a dynamical variable $\alpha$ which "belongs to
A," i.e., which is applicable only to functions of the $\xi$ coordinates. Let
the wave function of the united system after the interaction be expressed
in the form

$$\Psi = \sum_m C_m\,U_m(\xi,t)\,\chi_m(\eta,t), \tag{42-7}$$

where

$$C_m U_m(\xi,t) = \sum_n c_{nm}\,\phi_n(\xi,t); \qquad (U_m,U_m) = 1. \tag{42-8}$$

The functions $U_m(\xi,t)$ defined by (42-8) are not in general orthogonal, as
the $\chi$'s are. Then

$$\bar{\alpha} = (\alpha\Psi,\Psi) = \sum_m |C_m|^2\,(\alpha U_m,U_m)_\xi, \tag{42-9}$$

where the first scalar product is extended over the coordinate space of the
united system while the scalar products which follow the summation
sign are extended over the coordinate space of A alone. Thus the expectation
value of the arbitrary operator $\alpha$ is obtained either by averaging over the
pure-state assemblage of united systems, or over a mixed assemblage of
observed systems A in which the probability of the wave function $U_m$ is
$|C_m|^2 = \bigl(\sum_n c_{nm}\phi_n,\ \sum_n c_{nm}\phi_n\bigr)$ [cf. Eq. (4M)]. Although the A systems
have ceased to interact with the B systems, $\bar{\alpha}$ is not obtainable by treating
the A systems as a pure-state assemblage with any assignable single wave
function $\phi(\xi,t)$.
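The equality of the two averages in (42-9) can be checked numerically in a small case. The following is only a sketch under stated assumptions: A and B are each given a two-dimensional state space, the coefficients `c[n][m]` and the Hermitian array `a[n2][n]`, standing for $(\alpha\phi_n,\phi_{n'})$, are made-up numbers, and all function names are hypothetical. The quantity computed by `purity_of_A` is the trace of the square of the statistical matrix of A, a standard test (not taken from the text) which gives unity only for a pure state.

```python
import math

def expectation_united(c, a):
    """Mean value (alpha Psi, Psi) for Psi = sum_{n,m} c[n][m] phi_n chi_m."""
    total = 0
    for m in range(len(c[0])):
        for n in range(len(c)):
            for n2 in range(len(c)):
                total += a[n2][n] * c[n][m] * c[n2][m].conjugate()
    return total.real

def expectation_mixture(c, a):
    """Weighted average sum_m |C_m|^2 (alpha U_m, U_m) over the mixed assemblage."""
    total = 0.0
    for m in range(len(c[0])):
        weight = sum(abs(c[n][m]) ** 2 for n in range(len(c)))      # |C_m|^2
        if weight == 0:
            continue
        u = [c[n][m] / math.sqrt(weight) for n in range(len(c))]    # U_m in the phi basis
        mean = sum(a[n2][n] * u[n] * u[n2].conjugate()
                   for n in range(len(c)) for n2 in range(len(c)))
        total += weight * mean.real
    return total

def purity_of_A(c):
    """Trace of rho^2, with rho(n, n2) = sum_m c[n][m] c[n2][m]*."""
    dim = len(c)
    rho = [[sum(c[n][m] * c[n2][m].conjugate() for m in range(len(c[0])))
            for n2 in range(dim)] for n in range(dim)]
    return sum(rho[i][k] * rho[k][i] for i in range(dim) for k in range(dim)).real

a = [[1.0, 0.5], [0.5, -1.0]]            # assumed Hermitian matrix of alpha
correlated = [[0.6, 0.0], [0.0, 0.8]]    # Psi = 0.6 phi_1 chi_1 + 0.8 phi_2 chi_2
product = [[0.6, 0.0], [0.8, 0.0]]       # Psi = (0.6 phi_1 + 0.8 phi_2) chi_1

print(expectation_united(correlated, a), expectation_mixture(correlated, a))
print(purity_of_A(correlated), purity_of_A(product))
```

For the correlated coefficients the two averages agree while the purity falls below unity; for the product coefficients the purity is unity, as expected for a factorable $\Psi$.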

As Schrödinger has pointed out,² the subjective state of the united
system as defined by its $\Psi$ function does not give us sufficient information
to define a corresponding subjective state for each of its independent
parts. This is due to the fact that much of the information regarding A
contained in the $\Psi$ of the united system has to do with the correlation of
values of the coordinates of A with those of B and is therefore conditional
rather than absolute.

¹ Here we use the notation for purely discrete spectra, replacing the continuous-spectrum
eigenfunctions with an approximating set of discrete functions as suggested
on p. 163. To prove the validity of Eq. (42-6), we assume that the $\phi$'s and $\chi$'s form
complete normal orthogonal sets of eigenfunctions of $H_A$ and $H_B$, respectively, with
appropriate time factors to make them solutions of the corresponding time-dependent
Schrödinger equations. The product functions will then form a complete set in the
coordinate space of the combined system. Neglecting $H'$ we see that the right-hand
member of (42-6) is a solution of the Schrödinger equation for the combined system
when the $c_{nm}$'s are constant.

² E. Schrödinger, loc. cit., Sec. 41d, p. 328.

It is nevertheless possible to restore the system A to a pure state by
making observations on B without directly disturbing the system A
itself. For this purpose we must make what is called a complete predictive
observation of B, i.e., a simultaneous predictive observation of a
complete set of commuting dynamical variables. Such an observation
determines the wave function of B, regarded as an independent system,
uniquely except for a physically meaningless phase factor. Furthermore,
it insures that at the moment when the observation is completed
the wave function of the combined system shall have the form

$$\Psi(\xi,\eta,t) = U(\xi,t)\,\chi(\eta,t). \tag{42-10}$$

It follows that so long as there is no further interaction between the
two systems, $\Psi$ must remain factorable into a product like that which
described the combined system before the first interaction of A and B.
Hence an expansion of $\Psi$ like (42-7), after the observation of B, will
have but a single term and A is accordingly left in a pure subjective
state.

Let us now consider the circumstances under which observations of a
dynamical variable $\beta$ which belongs to B, made after the interaction of A
and B, are equivalent to observations of a dynamical variable $\alpha$ characteristic
of A. In such cases the variable $\beta$ can be measured after the
initial interaction without further disturbing the system A in any way.
As examples the observations of energy, momentum, and position of
elementary particles, or of atomic systems, by impact may be mentioned.
The possibility of such measurements depends upon the initial preparation
of A and B in a special way which permits us to replace the expansion
(42-6) by one of a more special character.

We replace the functions $\phi_n(\xi,t)$ by a complete orthonormal system of
eigenfunctions of $\alpha$, say $\varphi_{nr}(\xi)$. Here the first index n fixes the eigenvalue
of $\alpha$ and the second fixes the eigenvalues of such other dynamical variables
as must be added to $\alpha$ in order to make a complete set for the system A.
Similarly we replace the functions $\chi_m$ by a complete orthonormal system
of eigenfunctions of $\beta$, say $\chi_{ms}(\eta)$. The wave function of the combined
system, after interaction, can always be written in the form

$$\Psi(\xi,\eta,t) = \sum_{n,r,m,s} c(n,r,m,s,t)\,\varphi_{nr}(\xi)\,\chi_{ms}(\eta). \tag{42-11}$$

Let us now suppose that all the coefficients $c(n,r,m,s,t)$ vanish except
those allowed by the assumption of a functional relationship between
n and m. This will be true only for some special way of preparing the
systems A and B. With this assumption, however, we can discard the
index m in favor of the index n which determines it. The expansion
then becomes

$$\Psi = \sum_{n,r,s} c(n,r,s,t)\,\varphi_{nr}(\xi)\,\chi_{ns}(\eta). \tag{42-12}$$

A predictive measurement of $\beta$ now reduces $\Psi$ to the form

$$\Psi = C\sum_{r,s} c(n,r,s,t)\,\varphi_{nr}(\xi)\,\chi_{ns}(\eta), \tag{42-13}$$

where C is a normalizing factor. This is an eigenfunction of $\alpha$ as well as
of $\beta$, and hence the predictive measurement of $\beta$ is also a predictive
measurement of $\alpha$. It does not leave the system A in a pure subjective
state unless $\alpha$ is nondegenerate, but leaves it in a mixture of subjective
states all of which are eigenstates of $\alpha$ with a common eigenvalue $\alpha_n$.
The probability that a retrospective measurement of $\beta$ for a system in the
subjective state (42-12) will give the value $\beta_n$ is equal to the probability
that a retrospective measurement of $\alpha$ will give the value $\alpha_n$, viz.,
$\sum_{r,s}|c(n,r,s,t)|^2$. Hence a retrospective measurement of $\beta$ is equivalent to
a retrospective measurement of $\alpha$.

As regards actual schemes for preparing the systems A and B so as to
reduce the expansion (42-7) to the form (42-12), we consider only a
special case. Measurements of energy and momentum by impact
methods are based on the corresponding conservation laws. Let $\alpha$
and $\beta$ be the energies of A and B, respectively. If $\alpha$ and $\beta$ have unique
values before impact, their sum has a unique value E before and after
impact. All terms of (42-11) vanish except those for which $\alpha_n + \beta_m$
is equal to E. In the typical case, A is an atom initially at rest in its
normal state and B is an electron shot out of an electron gun with known
initial energy. By measuring the energy loss of the electron we are
able to determine the possible increments in the energy of the atom.



CHAPTER X 


MATRIX THEORY 
43. MATRIX ALGEBRA¹

Matrix algebra constitutes an important aid to the study of problems
in quantum mechanics and in this section we shall briefly review the 
elements of the subject, reserving applications for discussion in Secs. 
44 and 45. 

Let A, or $\|A(m,n)\|$, denote the matrix, or two-dimensional array of
numbers²

$$\mathbf{A} \equiv \|A(m,n)\| = \begin{Vmatrix} A(1,1) & A(1,2) & \cdots \\ A(2,1) & A(2,2) & \cdots \\ \cdots & \cdots & \cdots \end{Vmatrix}.$$

If the number of rows and number of columns are both finite we say that 
the matrix is finite. Otherwise it is said to be infinite. Matrices with 
an equal number of rows and columns are said to be square. Matrices 
having an infinite number of rows and of columns will be referred to as 
infinite square matrices. 

If a matrix A has p rows and q columns we shall say that it is a p × q
element matrix. Two matrices are said to be similar if they have the
same number of rows and the same number of columns — the definition 
applies even if the number of rows, or the number of columns, or both, 
are infinite. Only similar matrices can be added and subtracted. 

To add or subtract two similar matrices A and B we add or subtract 
corresponding elements. Thus 

$$(\mathbf{A} + \mathbf{B})(m,n) = A(m,n) + B(m,n). \tag{43-1}$$

A matrix, all of whose elements are zero, is called a zero matrix and is
denoted by 0. Two matrices A and B are equal if, and only if, all their
corresponding elements are equal; i.e., if A − B = 0. The commutative
and associative laws of addition in ordinary algebra apply to the addition
of similar matrices.

¹ Cf. M. Bôcher, Higher Algebra, or Born and Jordan, Elementare Quantenmechanik.
The author is indebted to the latter text for many suggestions followed in
the construction of this section.

² The first of the indices (m,n) indicates the row, and the second indicates the
column. The catchword "Roman Catholic" (R.C.) is helpful in remembering this
fundamental convention of our notation.

Let A be a p × q element matrix and let B be a p′ × q′ element matrix.
If q = p′ we define the product AB by the rule

$$(\mathbf{A}\mathbf{B})(m,n) = \sum_k A(m,k)\,B(k,n). \tag{43-2}$$

Evidently AB will be a p × q′ element matrix. Finite similar square
matrices can be multiplied in either order and yield products similar to 
the factors. Infinite square matrices can be multiplied in the same 
way if the infinite series obtained from the rule (43-2) are convergent. 
Our chief interest is in such similar finite or infinite square matrices and 
the rest of the following general discussion of matrix algebra refers 
primarily to them. 

On the basis of Eq. (43-2) it can readily be shown that the associative
law of ordinary algebra applies to the multiplication of matrices, although
the commutative law does not. Thus, in general,

$$\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A}.$$
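The multiplication rule (43-2) and the failure of the commutative law are easily illustrated with small finite matrices. The sketch below is an illustration only; the function name `matmul` and the particular matrices are arbitrary choices, not taken from the text.

```python
def matmul(A, B):
    """Product by rule (43-2): (AB)(m, n) = sum_k A(m, k) B(k, n)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B (q = p')")
    return [[sum(A[m][k] * B[k][n] for k in range(len(B)))
             for n in range(len(B[0]))] for m in range(len(A))]

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 2], [3, 4]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA)                                               # the two products differ
print(matmul(matmul(A, B), C) == matmul(A, matmul(B, C)))   # associative law holds
```

Here AB has its only nonvanishing element in the upper left corner and BA in the lower right, so the two products differ although the associative law is obeyed.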

The product of a matrix A and an ordinary number (scalar) c is the 
matrix whose typical element is 

$$(c\mathbf{A})(m,n) = c\,A(m,n).$$

A diagonal matrix is a square matrix all of whose elements vanish 
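except those on the principal diagonal. Thus, if D is a diagonal matrix,

$$\mathbf{D} = \|D_n\delta_{mn}\| = \begin{Vmatrix} D_1 & 0 & 0 & \cdots \\ 0 & D_2 & 0 & \cdots \\ 0 & 0 & D_3 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}.$$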


If the nonvanishing elements of the diagonal matrix D are real and 
arranged in order of ascending algebraic value, the matrix is said to be an 
ordered diagonal matrix. 

A unit matrix I is a diagonal matrix all of whose diagonal elements are unity:

$$\mathbf{I} = \|\delta_{mn}\|.$$

It follows at once that

$$\mathbf{A}\mathbf{I} = \mathbf{I}\mathbf{A} = \mathbf{A}.$$

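The inverse, or reciprocal, of a square matrix A is indicated by the
symbol $\mathbf{A}^{-1}$ and is defined, in case it exists, by the relations

$$\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}, \qquad \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}. \tag{43-3}$$

All integral powers of A can now be defined by the rules

$$\mathbf{A}^{p} = \mathbf{A}\mathbf{A}\mathbf{A}\cdots \text{ to } p \text{ factors}, \qquad \mathbf{A}^{-p} = (\mathbf{A}^{-1})^{p}, \qquad \mathbf{A}^{0} = \mathbf{I}.$$

With these definitions we can establish the usual exponential rules
$\mathbf{A}^{p}\mathbf{A}^{q} = \mathbf{A}^{p+q}$, $(\mathbf{A}^{p})^{q} = \mathbf{A}^{pq}$.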

Every finite square matrix A has a corresponding determinant which 
we call det A. It follows from the rule for the multiplication of deter- 
minants that 

$$\det \mathbf{A}\mathbf{B} = \det \mathbf{A} \cdot \det \mathbf{B}. \tag{43-4}$$

As the determinant of a unit matrix is unity we conclude from Eqs. (43-3)
and (43-4) that if A is a finite square matrix with a reciprocal $\mathbf{A}^{-1}$,

$$\det \mathbf{A} \neq 0.$$

Conversely, if A has a nonvanishing determinant, we can solve either
of the equations (43-3) for the elements of the reciprocal $\mathbf{A}^{-1}$. In either
case, one obtains $A^{-1}(n,m) = a(m,n)/\det \mathbf{A}$, where a(m,n) is the cofactor
of A(m,n) in the expansion of det A. Hence for such matrices $\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$
implies $\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$ and vice versa. There is no similar simple rule for
the existence of a reciprocal to an infinite matrix. Moreover, the rule
that $\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$ implies $\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$ is not valid in general for infinite matrices.
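The cofactor rule for the reciprocal of a finite matrix can be tried on a small example. The following sketch assumes integer elements and uses exact fractions; the helper names (`minor`, `det`, `cofactor`, `inverse`) and the sample matrix are hypothetical choices made for illustration. Expansion by minors is adequate for small matrices only.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(len(B)))
             for n in range(len(B[0]))] for m in range(len(A))]

def minor(A, m, n):
    """Matrix left after striking out row m and column n."""
    return [[A[i][j] for j in range(len(A)) if j != n]
            for i in range(len(A)) if i != m]

def det(A):
    """Determinant by expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** n * A[0][n] * det(minor(A, 0, n)) for n in range(len(A)))

def cofactor(A, m, n):
    return (-1) ** (m + n) * det(minor(A, m, n))

def inverse(A):
    d = det(A)
    if d == 0:
        raise ValueError("det A = 0: no reciprocal exists")
    size = len(A)
    # Element (n, m) of the reciprocal is the cofactor of A(m, n) divided by det A.
    return [[Fraction(cofactor(A, m, n), d) for m in range(size)]
            for n in range(size)]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 2]]
unit = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A_inv = inverse(A)
print(matmul(A, A_inv) == unit, matmul(A_inv, A) == unit)
```

For a finite matrix both products give the unit matrix, in agreement with the statement that either of the relations (43-3) implies the other.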

There are two classes of matrices with symmetry properties of special 
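importance for the quantum theory, viz., the Hermitian and unitary
matrices. In order to introduce these classes we define the adjoint
matrix to A as a matrix obtained from A by replacing each element by
its conjugate complex quantity and then interchanging rows and columns.
We denote the adjoint of A by $\mathbf{A}^\dagger$. Thus

$$A^\dagger(m,n) = A(n,m)^{*}. \tag{43-5}$$

An immediate consequence of this definition is that

$$(\mathbf{A} + \mathbf{B})^\dagger = \mathbf{A}^\dagger + \mathbf{B}^\dagger. \tag{43-6}$$

Furthermore,

$$(\mathbf{A}\mathbf{B})^\dagger(m,n) = (\mathbf{A}\mathbf{B})(n,m)^{*} = \sum_k A(n,k)^{*}B(k,m)^{*} = \sum_k B^\dagger(m,k)\,A^\dagger(k,n),$$

or

$$(\mathbf{A}\mathbf{B})^\dagger = \mathbf{B}^\dagger\mathbf{A}^\dagger. \tag{43-7}$$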

Definition: A Hermitian, or self-adjoint, matrix is a square matrix
which is equal to its adjoint. It conforms to the equation

$$\mathbf{A}^\dagger = \mathbf{A}, \tag{43-8}$$

and every element is equal to the complex conjugate of the element
symmetrically situated with respect to the principal diagonal of the matrix.

From (43-6) it follows that the sum of two Hermitian matrices is
Hermitian. Equation (43-7) shows that the product of two Hermitian
matrices is Hermitian if, and only if, the matrices commute. On the
other hand the symmetrized product ½(AB + BA) is always Hermitian
if A and B are Hermitian.
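These statements about adjoints and Hermitian products can be verified on a pair of 2 × 2 matrices. The example below is a sketch; the matrices and the helper names `adjoint` and `is_hermitian` are illustrative assumptions.

```python
def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(len(B)))
             for n in range(len(B[0]))] for m in range(len(A))]

def adjoint(A):
    """Adjoint by (43-5): conjugate each element, then interchange rows and columns."""
    return [[A[n][m].conjugate() for n in range(len(A))]
            for m in range(len(A[0]))]

def is_hermitian(A):
    return adjoint(A) == A

A = [[1, 1j], [-1j, 2]]      # Hermitian
B = [[0, 1], [1, 0]]         # Hermitian, does not commute with A
AB, BA = matmul(A, B), matmul(B, A)
symmetrized = [[0.5 * (AB[m][n] + BA[m][n]) for n in range(2)] for m in range(2)]

print(adjoint(AB) == matmul(adjoint(B), adjoint(A)))   # rule (43-7)
print(is_hermitian(AB), is_hermitian(symmetrized))
```

Since A and B fail to commute, AB is not Hermitian, while the symmetrized product is.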

Definition: A matrix U is said to be unitary if 

$$\mathbf{U}\mathbf{U}^\dagger = \mathbf{U}^\dagger\mathbf{U} = \mathbf{I}, \quad\text{or}\quad \mathbf{U}^\dagger = \mathbf{U}^{-1}. \tag{43-9}$$

In general the sum of two unitary matrices is not unitary, but we see
from Eq. (43-7) that if U and V are unitary, the products UV and VU are
also unitary.

Another immediate consequence of Eq. (43-7) is that if A is Hermitian
and U is unitary, the matrix B defined by

$$\mathbf{B} = \mathbf{U}^{-1}\mathbf{A}\mathbf{U} \tag{43-10}$$

is also Hermitian. 

A finite unitary matrix has a determinant of unit absolute value. To
prove this statement we note that $\det \mathbf{U}^\dagger = (\det \mathbf{U})^{*}$, since interchanging
the rows and columns of a determinant does not affect its value. Hence

$$|\det \mathbf{U}|^2 = (\det \mathbf{U})(\det \mathbf{U})^{*} = (\det \mathbf{U})(\det \mathbf{U}^\dagger) = \det(\mathbf{U}\mathbf{U}^\dagger) = 1.$$
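A numerical check of the properties of unitary matrices stated above may be helpful. In the sketch below the particular 2 × 2 unitary matrix U, the Hermitian matrix A, and the tolerance used in the hypothetical helper `close` are all assumptions made for the sake of illustration.

```python
import math

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(len(B)))
             for n in range(len(B[0]))] for m in range(len(A))]

def adjoint(A):
    return [[A[n][m].conjugate() for n in range(len(A))]
            for m in range(len(A[0]))]

def close(A, B, tol=1e-12):
    """Element-by-element comparison within a rounding tolerance."""
    return all(abs(A[m][n] - B[m][n]) < tol
               for m in range(len(A)) for n in range(len(A[0])))

s = 1 / math.sqrt(2)
U = [[s, s], [1j * s, -1j * s]]
unit = [[1, 0], [0, 1]]
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]

A = [[1, 2 - 1j], [2 + 1j, -3]]            # Hermitian
B = matmul(matmul(adjoint(U), A), U)       # (43-10), using U^{-1} = U†

print(close(matmul(U, adjoint(U)), unit), close(matmul(adjoint(U), U), unit))
print(abs(det_U))
print(close(B, adjoint(B)))
```

The determinant of this U is a pure imaginary number of unit modulus, and the transformed matrix B is again Hermitian within rounding error.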

If all the elements of a square matrix are zero except those which lie 
in a set of consecutive non-overlapping squares spread out along the 
principal diagonal, the matrix is said to have step form. Such a matrix 
is illustrated in Sec. 48a, p. 390, Fig. 20. A step matrix can also be 
regarded as a diagonal matrix each element of which is a square matrix. 
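Thus we write

$$\mathbf{A} = \begin{Vmatrix} A^{(1)} & 0 & \cdots \\ 0 & A^{(2)} & \cdots \\ \cdots & \cdots & \cdots \end{Vmatrix},$$

denoting each of the diagonal squares by a corresponding symbol $A^{(k)}$.
If A commutes with an ordered diagonal matrix $\mathbf{B} = \|b_n\delta_{nm}\|$,

$$(\mathbf{A}\mathbf{B})(n,m) - (\mathbf{B}\mathbf{A})(n,m) = A(n,m)[b_m - b_n] = 0.$$

Thus A(n,m) must vanish unless $b_n = b_m$, and A is therefore a step matrix,
in which each step is correlated with one value of $b_n$.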
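The connection between commutation with an ordered diagonal matrix and step form can be seen in a 3 × 3 example. The sketch below is illustrative only; the diagonal values `b`, the two trial matrices, and the helper names are assumptions.

```python
def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(len(B)))
             for n in range(len(B[0]))] for m in range(len(A))]

def diagonal(b):
    return [[b[n] if n == m else 0 for m in range(len(b))] for n in range(len(b))]

def commutator(A, B):
    """AB - BA; for diagonal B its (n, m) element is A(n, m) (b_m - b_n)."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[n][m] - BA[n][m] for m in range(len(A))] for n in range(len(A))]

def has_step_form(A, b):
    """True if A(n, m) vanishes whenever b_n differs from b_m."""
    return all(A[n][m] == 0
               for n in range(len(b)) for m in range(len(b)) if b[n] != b[m])

b = [1, 1, 2]                                    # ordered diagonal values
B = diagonal(b)
A_step = [[1, 2, 0], [3, 4, 0], [0, 0, 5]]       # steps of 2 and 1 rows
A_other = [[1, 2, 7], [3, 4, 0], [0, 0, 5]]      # element linking b = 1 with b = 2

print(commutator(A_step, B), has_step_form(A_step, b))
print(commutator(A_other, B), has_step_form(A_other, b))
```

The first matrix commutes with B and has one step for each distinct value of $b_n$; the second has a nonvanishing commutator element equal to A(1,3)[b_3 − b_1].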

The determinant of a step matrix is evidently equal to the product of 
the determinants of the individual steps. 

Two step matrices are said to be similar if each step of one has the 
same number of rows and columns as the corresponding step of the other. 
It is not difficult to see that the product of two similar step matrices 
A and B is a step matrix similar to A and B. 



If a unitary matrix is of step form, it follows from (43-9) that each
step is a unitary matrix. Consequently the determinant of each step
of a unitary matrix is of unit absolute value. A diagonal matrix is
a limiting case of a step matrix in which each step contains a single
element. Hence the diagonal elements of a diagonal unitary matrix
are of unit absolute value.

44. MATRICES AND OPERATORS 

44a. The Derivation of Matrices from Operators. — The algebra of 
square matrices obeys the same fundamental laws as the operator 
algebra of Sec. 37. Hence it is not surprising to find that we can estab- 
lish a definite correlation between linear operators and corresponding 
matrices. In fact operators and matrices give two different realizations 
of a common abstract mathematical framework. In Secs. 22d,e (pp.
116 to 120) we indicated the correlation between the class of functions
quadratically integrable over a certain domain and the class of all
complex vectors of infinitely many components $a_1, a_2, \cdots$ which yield
convergent sums $\sum_{i=1}^{\infty}|a_i|^2$. Just as an operator $\alpha$ can be used to transform
a function $\psi$ into another function $\alpha\psi$, so a matrix $\|A(k',k)\|$ can be used to
transform a vector $(x_1, x_2, \cdots)$ into another vector by the equations

$$y_k = \sum_{k'=1}^{\infty} x_{k'}\,A(k',k), \qquad k = 1, 2, \cdots$$

To establish the correlation between operators and matrices we 
introduce a linear operator a which has an adjoint with respect to a 
linear manifold of functions C and a domain of integration M in coordinate 
space (cf. definition, Sec. 32d, p. 203). Let $\varphi_1(x), \varphi_2(x), \cdots$ be a
complete orthonormal sequence of functions in C. In conformity with
the formal expansion

$$\alpha\varphi_k(x) = \sum_{k'=1}^{\infty}\varphi_{k'}(x)\,\alpha(k',k), \tag{44-1}$$

we define an infinite square matrix $\boldsymbol{\alpha} = \|\alpha(k',k)\|$ by the formula

$$\alpha(k',k) = \int_M \varphi_{k'}^{*}\,\alpha\varphi_k\,d\tau = (\alpha\varphi_k,\varphi_{k'}). \tag{44-2}$$

$\boldsymbol{\alpha}$ is said to be the matrix representation of the operator $\alpha$ with respect
to the given system of basic functions $\varphi_1, \varphi_2, \cdots$. Different basic
systems of functions yield different matrix representations, but, to avoid
circumlocution when but one such system is under consideration, we shall
refer to the corresponding representation of $\alpha$ as the matrix of $\alpha$.
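As a concrete instance of the definition (44-2), one may compute the first few matrix elements of a simple operator numerically. The sketch below rests on several assumptions not made in the text: the domain M is taken to be the interval 0 ≤ x ≤ 1, the basic functions are $\sqrt{2}\sin k\pi x$, the operator is multiplication by x, and the integrals are approximated by the midpoint rule. All function names are hypothetical.

```python
import math

def phi(k):
    """Assumed basic function sqrt(2) sin(k pi x), orthonormal on [0, 1]."""
    return lambda x: math.sqrt(2.0) * math.sin(k * math.pi * x)

def times_x(f):
    """The operator 'multiply by x', acting on a function f."""
    return lambda x: x * f(x)

def scalar_product(f, g, steps=2000):
    """(f, g) = integral of f g* over [0, 1], by the midpoint rule."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h).conjugate()
                   for i in range(steps))

def matrix_of(operator, size):
    """alpha(k', k) = (alpha phi_k, phi_k') for k, k' = 1, ..., size."""
    return [[scalar_product(operator(phi(k)), phi(kp))
             for k in range(1, size + 1)] for kp in range(1, size + 1)]

alpha = matrix_of(times_x, 3)
for row in alpha:
    print(["%.5f" % element for element in row])
```

The resulting array is real and symmetric, as it should be for a Hermitian operator, and each diagonal element is the mean value of x for the corresponding basic function.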



The matrix of the operator adjoint to $\alpha$ is readily seen to be the
adjoint of the matrix $\boldsymbol{\alpha}$. Thus

$$\alpha^\dagger(k',k) = (\alpha^\dagger\varphi_k,\varphi_{k'}) = (\varphi_k,\alpha\varphi_{k'}) = \alpha(k,k')^{*}.$$

Hence Hermitian operators are correlated with Hermitian matrices. It
follows from the assumed quadratic integrability of $\alpha\varphi_k$ that the series
$\sum_{k'}|\alpha(k',k)|^2$ is convergent for every k.¹ As $\alpha^\dagger\varphi_k$ is also quadratically
integrable, we infer that $\sum_{k'}|\alpha(k,k')|^2$ is convergent for every k.

Let $\psi$ and $\vec{\xi}$ denote, respectively, a function which belongs to C and the
vector formed from its Fourier coefficients $\xi_k = (\psi,\varphi_k)$. The Fourier
coefficients of $\alpha\psi$ with respect to the base functions $\varphi_1, \varphi_2, \cdots$ are
given by $(\alpha\psi,\varphi_k) = (\psi,\alpha^\dagger\varphi_k)$. In view of the completeness of the sequence
of $\varphi$'s the scalar product of any two functions over the domain M is equal
to the scalar product of the corresponding vectors. Thus we obtain

$$(\alpha\psi,\varphi_k) = \sum_n \alpha(k,n)\,\xi_n. \tag{44-3}$$

It follows from this formula that any two operators $\alpha$ and $\alpha'$ having the
same matrix $\boldsymbol{\alpha}$ can be identified for physical purposes. For, so long as we
stick to the manifold C, $(\alpha\psi - \alpha'\psi, \varphi_k) = 0$ independent of the choice of
$\psi$ and of k. This means that $\alpha\psi$ and $\alpha'\psi$ are physically equivalent
(cf. Sec. 36d, p. 256). Equation (44-3) and the postulated quadratic
integrability of $\alpha\psi$ and $\alpha^\dagger\psi$ also imply that the sums

$$\sum_k |(\alpha\psi,\varphi_k)|^2 = \sum_k \Bigl|\sum_n \alpha(k,n)\,\xi_n\Bigr|^2, \qquad \sum_k |(\alpha^\dagger\psi,\varphi_k)|^2 = \sum_k \Bigl|\sum_n \alpha^\dagger(k,n)\,\xi_n\Bigr|^2$$

are convergent.

¹ This implies by the inequality of Schwarz (Sec. 22, p. 116) that $\sum_{k'}\alpha(k',k)\,\alpha(k',k'')^{*}$
is convergent for every pair of values of k and k''.

From the uniqueness of the correlation of matrices and operators we are led to
infer that matrices can be used to define operators just as operators define matrices.
Let us therefore assume that we have given an arbitrary matrix $\boldsymbol{\beta}$ such that
$\sum_k|\beta(k,n)|^2$ and $\sum_k|\beta(n,k)|^2$ are convergent. Let us further suppose that we have given a vector
in Hilbert space $\vec{\eta} = (\eta_1, \eta_2, \cdots)$ such that

$$\sum_n |\eta_n|^2, \qquad \sum_k \Bigl|\sum_n \eta_n\,\beta(k,n)\Bigr|^2, \qquad \sum_k \Bigl|\sum_n \eta_n\,\beta^\dagger(k,n)\Bigr|^2$$

are convergent. We know from the Fischer-Riesz theorem¹ that there exists an
essentially unique quadratically integrable function y(x) such that $(y,\varphi_k) = \eta_k$
for all k. There also exist quadratically integrable functions u(x), v(x) such that

$$(u,\varphi_k) = \sum_n \eta_n\,\beta(k,n); \qquad (v,\varphi_k) = \sum_n \eta_n\,\beta^\dagger(k,n).$$

If we identify u and v with $\beta y$ and $\beta^\dagger y$, respectively, we define operators $\beta$ and $\beta^\dagger$
which are adjoint with respect to a linear manifold of functions which includes
y(x) and the sequence of $\varphi$'s.

If the $\varphi$'s are physically admissible wave functions, the diagonal elements of $\boldsymbol{\alpha}$
are by their definition the mean values of $\alpha$ for the states which the associated functions
describe.

If the basic functions $\varphi_1, \varphi_2, \cdots$ of a matrix scheme are eigenfunctions of $\alpha$,

$$\alpha(k',k) = (\alpha\varphi_k,\varphi_{k'}) = a_k\,\delta_{kk'},$$

where $a_k$ is the eigenvalue of $\alpha$ for $\varphi_k$. Thus the matrix of $\alpha$ in this scheme is diagonal.
The requirement that the $\varphi$'s shall be simultaneous eigenfunctions of a complete set
of independent commuting dynamical variables, say $\alpha_1, \alpha_2, \cdots$, determines the $\varphi$'s
uniquely except for physically meaningless phase factors, provided, of course, that
the operators $\alpha_1, \alpha_2, \cdots$ have discrete spectra. Hence it is frequently convenient
to designate a matrix scheme based on such $\varphi$'s as a scheme which makes the $\alpha$'s
diagonal.

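Let $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ denote the matrices of two operators $\alpha$ and $\beta$ based on the
same normal orthogonal system of functions. Then

$$(\alpha + \beta)(k',k) = \alpha(k',k) + \beta(k',k). \tag{44-4}$$

If $\alpha\beta\varphi_k$ is quadratically integrable for all values of k, and $\alpha$ has an adjoint
with respect to a manifold of functions that includes not only every $\varphi_k$
but also every $\beta\varphi_k$, then

$$(\alpha\beta)(k',k) = (\alpha\beta\varphi_k,\varphi_{k'}) = (\beta\varphi_k,\alpha^\dagger\varphi_{k'}).$$

From (22-32) we infer that

$$(\alpha\beta)(k',k) = \sum_{k''}(\beta\varphi_k,\varphi_{k''})(\alpha^\dagger\varphi_{k'},\varphi_{k''})^{*} = \sum_{k''}\alpha(k',k'')\,\beta(k'',k). \tag{44-5}$$

It follows from the above hypotheses that the sums $\sum_{k'}|(\alpha\beta)(k',k)|^2$,
$\sum_{k'}\bigl|\sum_{k''}\alpha(k',k'')\,\beta(k'',k)\bigr|^2$ are convergent. Conversely, if $\alpha\varphi_k$ and $\beta\varphi_k$ are
quadratically integrable for every k, and if the sum $\sum_{k'}\bigl|\sum_{k''}\alpha(k',k'')\,\beta(k'',k)\bigr|^2$
is convergent for every k, there must exist an operator $\gamma$ defined by

$$\gamma\varphi_k = \sum_{k'}\varphi_{k'}\sum_{k''}\alpha(k',k'')\,\beta(k'',k).$$

It can be proved by interchanging the order of two summations that $\gamma$ is
always equal to $\alpha\beta$. Equations (44-4) and (44-5) show that the sum of
the matrices of two operators is equal to the matrix of the sum of the
operators and that the product of the matrices is equal to the matrix
of the product of the operators taken in the same order.

¹ Cf. footnote 1, p. 259.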

It follows from the definition of the reciprocal of a matrix that if
the operator $\alpha$ has a unique reciprocal $\alpha^{-1}$, the matrix of $\alpha^{-1}$, in any
scheme, is equal to $(\boldsymbol{\alpha})^{-1}$.

Thus we see that for a very wide class of functions of operators there
exist corresponding matrix functions such that

$$\|f(\alpha, \beta, \cdots)\| = f(\boldsymbol{\alpha}, \boldsymbol{\beta}, \cdots). \tag{44-6}$$

Incidentally, if an operator $\alpha$ is unitary, i.e., if it has an adjoint
which is also its reciprocal, the matrix of $\alpha^\dagger$ will be adjoint and reciprocal
to $\boldsymbol{\alpha}$. Thus unitary operators always yield unitary matrices.

It has been assumed hitherto that the members of the normal orthogonal basic
set of functions $\varphi_1, \varphi_2, \varphi_3, \cdots$ are distinguished by single ordinal number indices.
There is no reason, however, why they should not be labeled with the multiple indices,
say n, l, m, so useful in connection with problems involving degeneracy. The expansion
(44-1) then takes the form

$$\alpha\varphi_{nlm} = \sum_{n',l',m'}\varphi_{n'l'm'}\,\alpha(n',l',m';\,n,l,m), \tag{44-7}$$

and the multiplication rule (44-5) becomes

$$(\alpha\beta)(n'',l'',m'';\,n,l,m) = \sum_{n',l',m'}\alpha(n'',l'',m'';\,n',l',m')\,\beta(n',l',m';\,n,l,m).$$

44b. Canonical Matrix Transformations. Let us now consider a
reversible linear transformation from one normal orthogonal basic system
of functions $\varphi_1, \varphi_2, \varphi_3, \cdots$ to a second normal orthogonal system
$\psi_1, \psi_2, \psi_3, \cdots$. (For convenience the subscripts k and n will be used for the
$\varphi$'s and $\psi$'s, respectively.) Such a transformation is defined by the
equations

$$\psi_n(x) = \sum_k \varphi_k(x)\,U(k,n), \qquad n = 1, 2, 3, \cdots \tag{44-8}$$

The matrix $\mathbf{U} = \|U(k,n)\|$ is called the matrix of the transformation.
Since the $\varphi$'s and the $\psi$'s both form normal orthogonal systems,

$$(\psi_{n'},\psi_{n''}) = \sum_k U(k,n')\,U(k,n'')^{*} = \sum_k U^\dagger(n'',k)\,U(k,n') = \delta_{n'n''},$$

or

$$\mathbf{U}^\dagger\mathbf{U} = \mathbf{I}. \tag{44-9}$$



Let us write the inverse transformation in the form

$$\varphi_k = \sum_n \psi_n\,W(n,k). \tag{44-10}$$

Substitution of (44-8) into (44-10) yields

$$\varphi_k = \sum_{k'}\sum_n \varphi_{k'}\,U(k',n)\,W(n,k), \tag{44-11}$$

from which it follows that UW = I. Multiplying each side of Eq. (44-9)
on the left by U and on the right by W, we obtain

$$\mathbf{U}\mathbf{U}^\dagger = \mathbf{I}. \tag{44-12}$$

Thus the matrix U defined by (44-8) is unitary, $\mathbf{W} = \mathbf{U}^\dagger = \mathbf{U}^{-1}$.


The ordered set of complex numbers $U(1,n), U(2,n), U(3,n), \cdots$
constitutes a complex vector in a space of infinitely many dimensions.¹
From (44-8) we learn that it is a vector representation of the function $\psi_n$
based on the $\varphi$ functions as unit vectors. U is compounded from the
totality of these $\vec{U}_n$ vectors, one for each column, as n runs through all
possible values. Equation (44-9) states that the vectors $\vec{U}_n$ form a normal
orthogonal set. Equation (44-12) in turn states that the vectors defined
by the rows of the matrix U also form a normal orthogonal set. These
two sets of relations constitute a parallel to the nine relations between
the direction cosines in a rotation of the reference axes in ordinary three-dimensional
space. Thus the unitary transformation (44-8) is a generalization
of the homogeneous orthogonal linear transformation of
elementary Euclidean geometry. We call it a "rotation of the axes
in function space."

Let $\psi_A$ be an arbitrary wave function with the expansion

$$\psi_A = \sum_k \varphi_k\,\xi_k = \sum_n \psi_n\,\eta_n. \tag{44-13}$$

Here $\xi_k$ and $\eta_n$ are simply the appropriate Fourier coefficients of $\psi_A$. Introducing
the transformation (44-8) and its inverse, we readily derive

$$\xi_k = \sum_n U(k,n)\,\eta_n; \qquad \eta_n = \sum_k U^{-1}(n,k)\,\xi_k. \tag{44-14}$$

Let $\vec{\xi}$ and $\vec{\eta}$ denote the vector representations of $\psi_A$ formed from the $\varphi_k$'s
and the $\psi_n$'s, respectively, and treated as one-column matrices. The
transformation equations (44-14) then take the form

$$\vec{\xi} = \mathbf{U}\vec{\eta}; \qquad \vec{\eta} = \mathbf{U}^{-1}\vec{\xi}. \tag{44-15}$$

¹ Cf. Sec. 22e, p. 119.

The scalar product of any two functions $\psi_A$, $\psi_B$ which are expansible in
terms of the basic set of functions $\varphi_1, \varphi_2, \varphi_3, \cdots$ is equal to the scalar
product of the corresponding vectors (cf. Sec. 22, p. 120). It follows
that the scalar product of two such vectors, $\vec{\xi}_A$ and $\vec{\xi}_B$, is independent
of the base system and hence invariant under a transformation from one
base system to another. This invariance of the scalar product justifies
our characterization of this type of transformation as a generalization
of the rotation of axes in elementary Euclidean geometry. It also shows
that there is a fundamental harmony between our definition of a unitary
matrix and the definition of a unitary transformation on p. 247. In
the latter place we defined a unitary transformation as the application
to a function of a reversible operator which preserves scalar products.
We now find that a homogeneous linear transformation from one set
of basic functions to another yields a transformation of vector representations
of functions which preserves scalar products provided that the
matrix of the transformation is unitary. Thus we refer to the transformation
described by Eqs. (44-8), (44-14), and (44-15) as a unitary
transformation. But a vector may be regarded as a function of a
variable, say n, defined only for integral values of n. Hence Eqs. (44-15)
form a parallel to Eqs. (36-2) with [UX] playing the part of a unitary
operator.

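Let us now examine the corresponding formulas for the transformation
of the matrix of an operator $\alpha$. We assume that $\alpha$ has an adjoint $\alpha^\dagger$
with respect to a linear manifold of functions which includes all the
$\varphi$'s and all the $\psi$'s. Let $\boldsymbol{\alpha}^{(\psi)}$ and $\boldsymbol{\alpha}^{(\varphi)}$ denote respectively the matrices
of $\alpha$ for the $\psi$ and $\varphi$ systems of base functions. It follows from the
completeness relation (22-32) that

$$\alpha(n',n'')^{(\psi)} = (\alpha\psi_{n''},\psi_{n'}) = \sum_{k'}(\alpha\psi_{n''},\varphi_{k'})(\psi_{n'},\varphi_{k'})^{*}. \tag{44-16}$$

But $(\psi_{n'},\varphi_{k'})^{*}$ is equal to $U^\dagger(n',k')$ or to $U^{-1}(n',k')$. Also

$$(\alpha\psi_{n''},\varphi_{k'}) = (\psi_{n''},\alpha^\dagger\varphi_{k'}) = \sum_{k''}(\psi_{n''},\varphi_{k''})(\alpha^\dagger\varphi_{k'},\varphi_{k''})^{*} = \sum_{k''}U(k'',n'')\,\alpha(k',k'')^{(\varphi)}.$$

Hence

$$\alpha(n',n'')^{(\psi)} = \sum_{k',k''}U^{-1}(n',k')\,\alpha(k',k'')^{(\varphi)}\,U(k'',n''). \tag{44-17}$$

This equation has the equivalent form¹

$$\boldsymbol{\alpha}^{(\psi)} = \mathbf{U}^{-1}\boldsymbol{\alpha}^{(\varphi)}\mathbf{U}. \tag{44-18}$$

¹ Equation (44-18) differs in form from Eq. (77) of Kemble-Hill, loc. cit., footnote
1, p. 2W, because of a difference in the convention regarding the numbering of the
elements of the transformation matrix.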

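We call $\boldsymbol{\alpha}^{(\psi)}$ the transform of $\boldsymbol{\alpha}^{(\varphi)}$ by the unitary matrix U. It follows
from (44-18) that

$$\boldsymbol{\alpha}^{(\varphi)} = \mathbf{U}\boldsymbol{\alpha}^{(\psi)}\mathbf{U}^{-1}. \tag{44-19}$$

[The reader will note the parallelism of Eqs. (36-5) and (44-19).]

Applying the theorem (43-7) to the right-hand member of Eq. (44-18)
we obtain

$$(\boldsymbol{\alpha}^{(\psi)})^\dagger = \mathbf{U}^{-1}(\boldsymbol{\alpha}^{(\varphi)})^\dagger\mathbf{U}. \tag{44-20}$$

Hence the transforms of two adjoint matrices are themselves adjoint.
If $\boldsymbol{\alpha}^{(\varphi)}$ is Hermitian it follows from this that $\boldsymbol{\alpha}^{(\psi)}$ is also Hermitian (cf.
p. 351).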

The transformation of the matrix representation of a dynamical 
variable from one set of basic coordinate functions to another by means 
of a unitary matrix U in accordance with Eqs. (44-18) and (44*19) is 
called a canonical matrix transformation. It is really a special case 
of the canonical operator transformation defined in Sec. 36c, p. 247. 

The functional relation between different matrices is not altered by the
application of a canonical transformation. Thus

$$U^{-1}(\alpha + \beta)U = U^{-1}\alpha U + U^{-1}\beta U,$$

$$U^{-1}\alpha\beta U = (U^{-1}\alpha U)(U^{-1}\beta U).$$

Hence if f denotes any function of the matrices α and β formed by
repeated addition and multiplication (including multiplication by an
ordinary complex number),¹

$$U^{-1}f(\alpha,\beta,\cdots)U = f(U^{-1}\alpha U,\,U^{-1}\beta U,\,\cdots). \tag{44·21}$$

As a corollary on Eqs. (44·20) and (44·21) we conclude that if a matrix α is
unitary, its transform U⁻¹αU by any other unitary matrix U is also
unitary.
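The statements above lend themselves to a quick numerical check. The following is a minimal sketch, not part of the original argument: the 2 × 2 matrices `U`, `alpha`, and `beta` are arbitrary sample data chosen for illustration, and the helper names are hypothetical. It verifies that a canonical transformation preserves sums and products and carries a Hermitian matrix into a Hermitian matrix.

```python
import cmath
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def transform(u, a):
    """Canonical transform U^-1 a U, using U^-1 = U^dagger for unitary U."""
    return matmul(dagger(u), matmul(a, u))


def close(a, b, tol=1e-12):
    """True when two matrices agree element by element within tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


# Hypothetical sample data: a 2 x 2 unitary U and two Hermitian matrices.
theta, phase = 0.7, cmath.exp(0.3j)
U = [[math.cos(theta), -math.sin(theta) * phase],
     [math.sin(theta) * phase.conjugate(), math.cos(theta)]]
alpha = [[1, 2 - 1j], [2 + 1j, -3]]
beta = [[0.5, 1j], [-1j, 2]]

u_is_unitary = close(matmul(dagger(U), U), [[1, 0], [0, 1]])

# Products are preserved: U^-1 (alpha beta) U = (U^-1 alpha U)(U^-1 beta U).
ta, tb = transform(U, alpha), transform(U, beta)
product_preserved = close(transform(U, matmul(alpha, beta)), matmul(ta, tb))

# Sums are preserved as well.
total = [[alpha[i][j] + beta[i][j] for j in range(2)] for i in range(2)]
sum_preserved = close(transform(U, total),
                      [[ta[i][j] + tb[i][j] for j in range(2)] for i in range(2)])

# The transform of a Hermitian matrix is again Hermitian.
hermitian_preserved = close(ta, dagger(ta))

print(u_is_unitary, product_preserved, sum_preserved, hermitian_preserved)
```

A check of this kind confirms the algebra only for the sample matrices chosen; the general statement rests on the proof in the text.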

Although in general the matrices of quantum mechanics have an
infinite number of rows and columns, we frequently have to do with
problems in which finite matrices appear and are subject to unitary
transformations. Hence it is of importance to know that the sum of
the diagonal terms of a finite square matrix is invariant with respect to a
transformation of the form

$$B = S^{-1}AS.$$

To prove this proposition we have only to write out in detail the expression
for the transformed value of the sum, viz.,

$$\sum_n B(n,n) = \sum_{n,k,k'} S^{-1}(n,k)\,A(k,k')\,S(k',n), \tag{44·22}$$

¹ We can easily extend the validity of (44·21) to more general functions. For
example, if α⁻¹ exists, U⁻¹α⁻¹U = (U⁻¹αU)⁻¹ is immediately verified; furthermore,
if α² = β, then (U⁻¹αU)² = U⁻¹βU, or U⁻¹αU = (U⁻¹βU)^(1/2).




reverse the order of summation, and apply the relation

$$\sum_n S(k',n)\,S^{-1}(n,k) = \delta_{k'k}.$$

The diagonal sum of a finite square matrix A is called the spur or trace of
the matrix and is indicated by the symbol Spur A.
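The invariance of the spur does not require S to be unitary, only that S⁻¹ exist. A minimal numerical sketch follows; the matrices `A` and `S` are arbitrary sample data, and `inverse_2x2` and `spur` are hypothetical helper names introduced here for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(s):
    """Reciprocal of a 2 x 2 matrix with nonvanishing determinant."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    return [[s[1][1] / det, -s[0][1] / det],
            [-s[1][0] / det, s[0][0] / det]]


def spur(a):
    """Diagonal sum (spur, or trace) of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


A = [[2.0, 1.0], [0.5, -3.0]]
S = [[1.0, 2.0], [3.0, 5.0]]   # not unitary; determinant is -1

B = matmul(inverse_2x2(S), matmul(A, S))   # B = S^-1 A S
print(spur(A), spur(B))
```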

We are now prepared to consider from the matrix point of view the 
problem of determining the discrete eigenvalues of a dynamical variable 
together with the corresponding eigenfunctions. 

44c. Matrix Form of the Eigenvalue-eigenfunction Problem. We
assume that the linear operator α has an adjoint α† with respect to a
linear manifold A which includes the complete normal orthogonal system
of functions φ₁, φ₂, ··· and seek to find solutions of the eigenvalue-eigenfunction
equation

$$\alpha\psi(x) = a\psi(x) \tag{44·23}$$

which belong to the manifold A. We call such solutions "discrete
eigenfunctions" of α. Using the notation of (44·13) we denote the
Fourier coefficients of the desired function ψ with respect to φ_k by ξ_k.
Then

$$a\xi_k = (\alpha\psi,\varphi_k) = (\psi,\alpha^{\dagger}\varphi_k).$$

Since the φ's form a complete system we can evaluate the last scalar product
by means of the Fourier coefficients of ψ and α†φ_k. Thus

$$a\xi_k = \sum_n \xi_n\,\alpha^{\dagger}(n,k)^{*} = \sum_n \alpha(k,n)\,\xi_n, \qquad k = 1, 2, 3, \cdots \tag{44·24}$$

This is equivalent to the matrix equation

$$\alpha\vec{\xi} = a\vec{\xi}, \tag{44·25}$$

which is the matrix form of (44·23). The number a is referred to as an
eigenvalue of the matrix α as well as of the operator α.

The components of α can be worked out by quadrature when α is
defined and the basic set of φ's has been chosen. Hence α may be considered
known. There are an infinite number of Eqs. (44·24) corresponding
to the infinite number of unknown components of ξ⃗. The eigenvalue
a is also unknown. It follows from our method of derivation that if
α has discrete eigenfunctions there must exist corresponding eigenvalues
and vectors ξ⃗ which satisfy (44·25). The latter are called eigenvectors
of the matrix α. In the case of every eigenvector Σ_k |ξ_k|² is convergent.

If ψ is normalized, its vector representation must also be normalized
according to the rule Σ_k |ξ_k|² = 1. This is always possible as (44·25)
determines ξ⃗ at most to an arbitrary constant factor. Conversely,
if we find a solution ξ⃗ of (44·25) with convergent sums

k k n k n 

it is easy to prove (cf. p. 353) that a corresponding function ψ(x) exists
which belongs to the adjoint manifold of α, has the Fourier coefficients
(ψ,φ_k) = ξ_k, and has the property of reducing all corresponding
Fourier coefficients of the two sides of (44·23) to equality. This does
not prove that the two sides of (44·23) are equal at every point, but
we can still count the function ψ as an eigenfunction of α for quantum-mechanical
purposes. The series Σ_k ξ_k φ_k always converges in the mean
upon this function and will ordinarily converge upon it at every point.

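If the matrix α is Hermitian, its eigenvalues are real. Thus αξ⃗ = aξ⃗
implies ξ⃗†α† = a*ξ⃗†. Hence

$$\vec{\xi}^{\dagger}\alpha\vec{\xi} - \vec{\xi}^{\dagger}\alpha^{\dagger}\vec{\xi} = (a - a^{*})\,\vec{\xi}^{\dagger}\vec{\xi}. \tag{44·26}$$

If α is Hermitian the left-hand side vanishes. Since ξ⃗†ξ⃗, or Σ_k |ξ_k|²,
cannot vanish, it follows that a = a*, or that a is real. If the matrix α is
unitary, its eigenvalues have unit absolute value, for in this case αξ⃗ = aξ⃗
implies ξ⃗†α⁻¹ = a*ξ⃗†. Hence

$$0 = (a^{*}a - 1)\,\vec{\xi}^{\dagger}\vec{\xi}. \tag{44·27}$$

In either case eigenvectors belonging to different eigenvalues must be
orthogonal. Thus if α is Hermitian and ξ⃗₁, ξ⃗₂ are eigenvectors with the
eigenvalues a₁, a₂, respectively,

$$0 = \vec{\xi}_2^{\dagger}\alpha\vec{\xi}_1 - \vec{\xi}_2^{\dagger}\alpha^{\dagger}\vec{\xi}_1 = (a_1 - a_2^{*})\,\vec{\xi}_2^{\dagger}\vec{\xi}_1 = (a_1 - a_2)\,\vec{\xi}_2^{\dagger}\vec{\xi}_1. \tag{44·28}$$

If α is unitary

$$0 = \vec{\xi}_2^{\dagger}\alpha^{-1}\alpha\vec{\xi}_1 - \vec{\xi}_2^{\dagger}\vec{\xi}_1 = (a_2^{*}a_1 - 1)\,\vec{\xi}_2^{\dagger}\vec{\xi}_1. \tag{44·29}$$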

It is always possible to form a set of n mutually orthogonal and
normalized vectors ξ⃗₁, ξ⃗₂, ···, ξ⃗_n from any set of n linearly independent
vectors x⃗₁, ···, x⃗_n by appropriate linear combination.¹ Since
any linear combination of solutions of (44·25) is a solution, we can always
form n orthogonal eigenvectors from any n linearly independent eigenvectors
with a common eigenvalue.

¹ Cf. Courant-Hilbert, M.M.P., 2d ed., Chap. II, p. 41, for procedure.
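The linear combination referred to above is commonly carried out by successive orthogonalization (the Gram-Schmidt procedure). The text cites Courant-Hilbert for the method without spelling it out, so the following is only a sketch under that assumption; the function names and the sample vectors are hypothetical.

```python
import math


def inner(x, y):
    """Scalar product y^dagger x = sum_k x_k * conj(y_k)."""
    return sum(complex(xk) * complex(yk).conjugate() for xk, yk in zip(x, y))


def gram_schmidt(vectors):
    """Return orthonormal vectors spanning the same space as `vectors`."""
    ortho = []
    for v in vectors:
        w = [complex(c) for c in v]
        for e in ortho:
            coeff = inner(w, e)                       # component of w along e
            w = [wk - coeff * ek for wk, ek in zip(w, e)]
        norm = math.sqrt(inner(w, w).real)
        if norm < 1e-12:
            raise ValueError("vectors are not linearly independent")
        ortho.append([wk / norm for wk in w])
    return ortho


# Three linearly independent sample vectors with complex components.
basis = gram_schmidt([[1, 1j, 0], [1, 0, 1], [0, 1, 1]])
for i in range(3):
    print([round(abs(inner(basis[i], basis[j])), 10) for j in range(3)])
```

Applied to a set of degenerate eigenvectors, each output vector is a linear combination of the inputs and therefore remains an eigenvector with the same eigenvalue.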

Usually the actual solution of the infinite set of homogeneous linear
equations (44·24) is very difficult and we have to be content with approximate
solutions obtained by perturbation methods (cf. Chap. XI). Since
the problem is an obvious extrapolation of the simpler one of finding the
eigenvalues of a finite square matrix, it is best to examine this latter
problem first, especially as the solution of such a simplified case is the
usual first step in dealing with the infinite case. We accordingly seek
eigenvectors of the g × g element matrix A = ||A_mn||. Let x⃗ = ||x_k||
denote such a vector. By (44·24) its components must yield a nontrivial
(i.e., nonvanishing) solution of the set of g equations

$$\sum_n (A_{mn} - a\,\delta_{mn})\,x_n = 0, \qquad m = 1, 2, \cdots, g. \tag{44·30}$$

Such a solution exists only if the determinant of the coefficients vanishes,
i.e., if a is a root of the so-called "secular" equation

$$\det(A - aI) = \begin{vmatrix} A_{11}-a & A_{12} & \cdots & A_{1g} \\ A_{21} & A_{22}-a & \cdots & A_{2g} \\ \cdots & \cdots & \cdots & \cdots \\ A_{g1} & A_{g2} & \cdots & A_{gg}-a \end{vmatrix} = 0. \tag{44·31}$$

This equation is of degree g in the unknown a and has accordingly g roots.
If we insert for a in Eqs. (44·30) a root of (44·31) of multiplicity p, the
matrix of the coefficients in the former set of equations has its rank
reduced to g − p. There are then p linearly independent solutions which
are readily derived by methods described in texts on algebra. From
them we can form p orthogonal eigenvectors in an infinite number of
ways, if p > 1.
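For g = 2 the secular equation is a quadratic and can be solved in closed form. The sketch below assumes a Hermitian 2 × 2 matrix with a nonvanishing off-diagonal element; the sample matrix `A` and the function names are illustrative choices, not taken from the text. It finds the two roots, builds a normalized eigenvector for each from the first of Eqs. (44·30), and prints the residuals of the eigenvalue equation and the scalar product of the two eigenvectors, which should vanish.

```python
import math


def secular_roots(A):
    """Roots of det(A - aI) = 0 for a Hermitian 2 x 2 matrix A."""
    a11, a22 = A[0][0].real, A[1][1].real
    trace = a11 + a22
    det = a11 * a22 - abs(A[0][1]) ** 2
    disc = math.sqrt(trace * trace - 4 * det)   # never negative for Hermitian A
    return (trace - disc) / 2, (trace + disc) / 2


def eigenvector(A, a):
    """Normalized solution of (A - aI)x = 0, assuming A[0][1] != 0."""
    v = [A[0][1], a - A[0][0]]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]


def residual(A, a, v):
    """Largest component of A v - a v in absolute value."""
    return max(abs(A[m][0] * v[0] + A[m][1] * v[1] - a * v[m]) for m in range(2))


A = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]
root_low, root_high = secular_roots(A)
v_low, v_high = eigenvector(A, root_low), eigenvector(A, root_high)

# Scalar product of the two eigenvectors.
overlap = sum(x * y.conjugate() for x, y in zip(v_low, v_high))
print(root_low, root_high, residual(A, root_low, v_low),
      residual(A, root_high, v_high), abs(overlap))
```

The sum of the two roots equals the spur of A, in agreement with the invariance of the diagonal sum noted earlier.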

In the more difficult case of Eq. (44·25) where the matrix α has an
infinite number of rows and columns and the eigenvectors have an infinite
number of elements, the solutions cannot be found so easily, nor can
we be sure a priori that they even exist. Intuition would suggest,
however, the possibility of dealing with (44·25) by means of a method of
successive approximations in which at each step we have to do with
finite matrices obtained from α and ξ⃗ by arbitrarily limiting the number
of rows and columns considered. For the present we reserve further
discussion of methods for solving (44·25).




If we find a normal orthogonal set of eigenvectors of α and designate
them by ξ⃗^(1), ξ⃗^(2), ···, preferably so as to bring the vectors with a
common eigenvalue together and to arrange the eigenvalues in the
order of their algebraic magnitude (assuming the eigenvalues to be real),
we have the material for the construction of a matrix

$$U = \begin{Vmatrix} \xi_1^{(1)} & \xi_1^{(2)} & \cdots \\ \xi_2^{(1)} & \xi_2^{(2)} & \cdots \\ \cdots & \cdots & \cdots \end{Vmatrix} = \begin{Vmatrix} U(1,1) & U(1,2) & \cdots \\ U(2,1) & U(2,2) & \cdots \\ \cdots & \cdots & \cdots \end{Vmatrix}$$

such that U†U = I. If a_n denotes the eigenvalue which goes with the
eigenvector ξ⃗^(n) and the eigenfunction ψ_n, Eq. (44·24) takes the form

$$\sum_m \alpha(k,m)\,U(m,n) = a_n\,U(k,n). \tag{44·32}$$

Denoting the diagonal matrix ||a_n δ_nm|| by A, we can gather together the
different equations (44·24) for the different eigenvectors into the single
equation

$$\alpha U = UA. \tag{44·33}$$

Let us now introduce the assumption that the operator α has a complete
orthonormal system of discrete eigenfunctions. The φ functions will
then be expansible in terms of the discrete eigenfunctions ψ_m of α and the
transformation

$$\psi_n = \sum_k \varphi_k\,U(k,n)$$

will be reversible. Under these circumstances the matrix U has a reciprocal,
viz., the matrix of the inverse transformation, and is unitary. Multiplying
each side of (44·33) by U⁻¹ we obtain

$$U^{-1}\alpha U = A. \tag{44·34}$$

If a unitary matrix U can be found which satisfies Eq. (44·34), it must
contain all the eigenvectors of (44·25) and by (44·19) it reduces α to a
diagonal form which contains all the eigenvalues. The eigenvalue problem
of matrix theory is usually stated as that of finding such a matrix. If,
however, the operator α has a partially continuous spectrum, its discrete
eigenfunctions will not form a complete set and no proper unitary matrix
U which satisfies (44·34) exists. In cases of this kind Eqs. (44·24)
and (44·25) are still useful and have solutions which describe the incomplete
discrete spectrum of α.

A transformation such as (44·34) which reduces a matrix α to diagonal
form is frequently referred to as a principal-axis transformation since
it converts the quadratic form into a sum of squares
and so parallels the reduction of the equation for a quadric surface in
three dimensions to its principal axes.

The matrix U of Eqs. (44·33) and (44·34) is not uniquely defined, for, as previously
stated, a unitary transformation of the eigenvectors belonging to any given
eigenvalue among themselves yields a new set of eigenvectors as good as the first.
Let U₁ and U₂ denote any two solutions of Eq. (44·34). Then U₂ = U₁V where V
is a unitary matrix. By hypothesis

$$U_1^{-1}\alpha U_1 = A_1, \qquad U_2^{-1}\alpha U_2 = A_2,$$

where A₁ and A₂ are diagonal. Then

$$V^{-1}A_1V = A_2.$$

Multiplying through by the prefactor V and writing out the expression for the typical
element of each side of the resulting equation, we obtain

$$A_1(k,k)\,V(k,k') = V(k,k')\,A_2(k',k').$$

It follows that all elements of V must vanish except those which relate rows and
columns belonging to a common eigenvalue of α. Conversely any unitary matrix
with this property transforms one solution of (44·34) into another. If we apply the
transformation

$$\psi_n' = \sum_k \psi_k\,V(k,n)$$

to the functions ψ_k = Σ_k′ φ_k′ U₁(k′,k) we shall simply form new eigenfunctions of α by
taking linear combinations of the old belonging to a common eigenvalue. If the
numbering of the matrix U is such that A₁ is an ordered diagonal matrix (i.e., one with
all equal values of the diagonal elements grouped together), the matrix V will have a
step form such that if we think of V as laid on top of A₁ the steps of V will be squares
built about corresponding sets of equal diagonal elements of A₁.

Each step of such a unitary matrix V will have a determinant of unit absolute
value (cf. p. 351). If the eigenvalues of α are nondegenerate, V will be diagonal and
its nonvanishing elements will be of unit absolute value.

*44d. Matrices with Continuous Elements. The matrix formulation
of the eigenvalue-eigenfunction problem is so useful that it is frequently
desirable to extend this type of formalism to problems involving continuous
spectra. One way to do this is to follow a procedure used in
Secs. 30 and 32j, introducing in the beginning an arbitrary bounding
surface, or box, on which the wave function is required to vanish, carrying
the problem through to the point of deriving the results to be compared
with experiment and then in the final formulas allowing the surface
of the box to move out to infinity.¹

Another method in which Dirac has led the way is to extend the
concept of a matrix to include functions of two independent variables,
or sets of variables, which take on continuous as well as discontinuous
ranges of values. This procedure involves difficulties of the same kind
as those met in trying to expand wave functions in terms of eigenfunctions
of multiplication operators (cf. Secs. 36g, h). These difficulties are
overcome as before by the introduction of pathological functions which
disregard the convergence troubles which arise when we attempt to
reverse the order of certain limiting processes. A formalism is thus set
up which permits one readily to carry through the early stages of a
calculation without merging the continuous spectrum with the discrete
spectrum. Then, at the end of the calculation in the final formulas to be
compared with experiment, the proper order of the limiting processes
is restored, the formulas are reinterpreted by using eigendifferentials
instead of wave functions and taking the limits as the range of the
eigendifferentials approaches zero.

¹ The problem has been admirably discussed by J. Frenkel, Wave Mechanics,
Advanced General Theory, Secs. 10 and 14, Oxford, 1934.

In order to introduce this Dirac method we revert to his notation
(cf. Sec. 36g) using the symbol (α′|β′) to indicate the probability amplitude
in the space of α₁, α₂, ··· for a state in which the variables β₁, β₂, ···
are known to have the eigenvalues β₁′, β₂′, ···. Such a probability
amplitude can be regarded as a generalized matrix with rows labeled
by the eigenvalues of the α's and columns labeled by the eigenvalues
of the β's. In multiplying such matrices we must resort to integration
in place of summation to cover the continuous ranges of the independent
variables. Thus we define the product of (α′|β′) into (β′|γ′) as the
generalized matrix

$$(\alpha'|\gamma') = \sum\!\!\!\!\!\!\int_{\beta'} (\alpha'|\beta')(\beta'|\gamma').$$

Thus all scalar products can be interpreted as matrix products and the
formulas of Sec. 36i for the transformation of probability amplitudes
from one set of independent variables to another become matrix-product
formulas.

The normalization condition for the probability amplitude (α′|β′) is
given by Eq. (36·9) or, implicitly and combined with the orthogonality
condition, by Eqs. (36·11) and (36·12). This orthogonality-normalization
condition can be used to show that (α′|β′) regarded as a matrix is unitary.

Let (af'|jS')i; denote the mean value of over a hypercube of 

side ri in /3" space with its center at /S" = Equation (36-11) can be 
given the form 

lim (44-35) 

where 

(44-36) 

Thus /2,(/3",|8') can be regarded as an approximation to a function 





(i(3"|i8') with the property that 

V • • • 1/31', ^ 2 ', • • • ) = ^(/3i',^2', • • • ). 

(44.37) 

If all the β's have continuous spectra, this function is a product
of Dirac δ functions, one for each dynamical variable β_i. If all have
discrete spectra, it is a product of the form δ_{β₁″β₁′} δ_{β₂″β₂′} ···, where
δ_{β₁″β₁′} is defined as unity when β₁″ = β₁′ and zero otherwise. Following
Dirac we ignore the difficulty of interchanging limiting processes and
replace (44·36) formally by

$$\sum\!\!\!\!\!\!\int_{\alpha'} (\beta''|\alpha')(\alpha'|\beta') = (\beta''|\beta'). \tag{44·38}$$

Comparing with (36·75) we see that (β′|β″) is in fact the probability
amplitude in β′ space when β₁, β₂, ··· are known to have the values
β₁″, β₂″, ···. Regarded as a matrix the function (β′|β″) is diagonal.
The diagonal elements in the discrete case are unity. Diagonal elements
which are not purely discrete are infinite. Nevertheless (β′|β″) plays the
part of a unit matrix, for if multiplied in either order into any other
matrix g(β′,β″) it yields g(β′,β″),

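$$\sum\!\!\!\!\!\!\int_{\beta'''} (\beta'|\beta''')\,g(\beta''',\beta'') = \sum\!\!\!\!\!\!\int_{\beta'''} g(\beta',\beta''')\,(\beta'''|\beta'') = g(\beta',\beta'').$$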

The left-hand member of (44·38) is the typical element of the product of
(β′|α′) by its adjoint (α′|β′). Hence (44·38) states that (α′|β′) is unitary.

From this point of view (44*8) is seen as a special case of (36*75) in 
which the functions (/3'|a'), (|3'|a;'), (x'\a) are replaced by <pk{x), 

and U(k,n), respectively. Thus the matrix U of (44·8) plays the part
of a probability amplitude.

Let us turn now to the matrix of an operator γ with respect to
a basic set of orthonormal functions (x′|β′). We assume that γ has an
adjoint γ† with respect to x′ space and to a class of functions which includes
the quadratically integrable functions (x′|β′) and also the eigendifferentials
of those functions which are not quadratically integrable. If
(x′|β′) is not quadratically integrable in x′ space, we cannot expect to
expand γ·(x′|β′) in terms of simultaneous eigenfunctions of the β's.¹
However, if (x′|β′)_η is the mean value of (x′|β′) over a hypercube of side η
in β′ space (cf. Secs. 36g and 36h), lim_{η→0} (x′|β′)_η = (x′|β′) while γ·(x′|β′)_η
can be so expanded. This fact is formally represented by the equation

$$\gamma\cdot(x'|\beta') = \sum\!\!\!\!\!\!\int_{\beta''} (x'|\beta'')(\beta''|\gamma|\beta'). \tag{44·39}$$

¹ Here the dot is introduced after the operator γ to prevent confusion between
the transform of (x′|β′) by γ, γ·(x′|β′), and a function of γ with the arguments
x′, β′.





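Here the expansion coefficients (β″|γ|β′) form a matrix similar to the
||α(k″,k′)|| of (44·3). The Dirac notation (β″|γ|β′) is here substituted
for the notation γ(β″,β′) because it gives our equations a particularly
symmetrical form. Explicitly (β″|γ|β′) has the value

$$(\beta''|\gamma|\beta') = \sum\!\!\!\!\!\!\int_{x'} (\beta''|x')\,\gamma\cdot(x'|\beta') = \int (x'|\beta'')^{*}\,\gamma\cdot(x'|\beta')\,dx_1'\,dx_2'\cdots. \tag{44·40}$$

If γ† = γ, the matrix (β″|γ|β′) is evidently Hermitian. We can now
generalize Eq. (44·19) to

$$(\beta''|\gamma|\beta') = \sum\!\!\!\!\!\!\int_{\alpha''}\sum\!\!\!\!\!\!\int_{\alpha'} (\beta''|\alpha'')(\alpha''|\gamma|\alpha')(\alpha'|\beta') \tag{44·41}$$

if we replace the ψ_n's by simultaneous eigenfunctions of the α's in x′ space
and the φ_k's by simultaneous eigenfunctions of the β's.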

To obtain the generalized equivalent of the eigenvalue-eigenfunction
problem of Eq. (44·34) we identify the operator γ with one of the α's, say
α₁. In α′ space this operator is multiplicative and, by (44·40), (α′|α₁|α″)
is Σ∫_{x′} (α′|x′) α₁·(x′|α″) or α₁″(α′|α″). This is the typical diagonal
matrix of the mixed discrete-continuous type which must take the place
of A in (44·34). The latter equation is now replaced by

$$\sum\!\!\!\!\!\!\int_{\beta'}\sum\!\!\!\!\!\!\int_{\beta''} (\alpha'|\beta')(\beta'|\alpha_1|\beta'')(\beta''|\alpha'') = \alpha_1''\,(\alpha'|\alpha''). \tag{44·42}$$

Taking the product of each side by (β‴|α′) and reducing, we obtain

$$\sum\!\!\!\!\!\!\int_{\beta''} (\beta'''|\alpha_1|\beta'')(\beta''|\alpha'') = \alpha_1''\,(\beta'''|\alpha'') \tag{44·43}$$

as the equivalent of (44·33).

The recognition that the totality of the elements of the transformation
matrix U of (44·34) constitutes a probability amplitude is perhaps the
point of chief physical interest in the Dirac-Jordan transformation
theory.

45. THE MATRIX THEORY OF HEISENBERG, BORN, AND JORDAN¹

45a. Fundamental Postulates. In this volume the Schrödinger wave
equation is treated as fundamental and matrices are introduced as tools
for solving problems based on this equation. In the early formulation
of the theory by Heisenberg, Born, and Jordan, however, matrices were
primary and had to be dealt with independently of any relationship to a
wave equation or to wave functions. The H. B. J. theory was based on
Bohr's correspondence principle which will be discussed in Sec. 46.
We here proceed to a brief summary of the fundamental postulates.

¹ W. Heisenberg, Zeits. f. Physik 33, 879 (1925); M. Born, W. Heisenberg, and
P. Jordan, Zeits. f. Physik 35, 557 (1926). Cf. also Born-Jordan,

a. The "motions" of a mechanical system with a discrete-energy-level
spectrum are to be described by the variation in time of matrices for the
coordinates and conjugate momenta. Possible choices of coordinates
and momenta are based on classical theory. To distinguish between the
matrices defined by Eqs. (44·1) and (44·2) and the matrices here postulated
when necessary we shall refer to the former as Schrödinger matrices
and to the latter as Heisenberg matrices. The latter will be proved to
be a special case of the former.

b. The Heisenberg matrices are Hermitian and have quadratically
summable rows and columns.

c. Their elements are exponential harmonic functions of the time.
Thus, if q_k is a coordinate and p_k the conjugate momentum, the elements
of their matrices satisfy the relations

$$q_k(n,m) = q_k^{0}(n,m)\,e^{2\pi i\nu_{nm}t}, \qquad p_k(n,m) = p_k^{0}(n,m)\,e^{2\pi i\nu_{nm}t}. \tag{45·1}$$

The frequencies here introduced are related to the possible energy
levels of the system by the Einstein law

$$h\nu_{nm} = E_n - E_m. \tag{45·2}$$

Thus every row and every column is associated with a definite energy
level.

d. The Heisenberg matrices of a system of coordinates q₁, q₂, ···, q_f
and their conjugate momenta p₁, ···, p_f are subject to the commutation
rule

$$[q_k, p_j] = \frac{2\pi i}{h}\,[p_j q_k - q_k p_j] = \delta_{kj}\,I, \tag{45·3}$$

where I is a unit matrix. It follows from this relation that the matrices
must all be infinite.¹ A system of matrices which conforms to (45·3) is
said to be canonical.

e. The Heisenberg matrices are subject to equations of motion of the
canonical Hamiltonian form

$$\frac{dq_k}{dt} = \frac{\partial H}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k}, \tag{45·4}$$

where the Hamiltonian matrix H is a function of the q's and p's and the
partial derivatives are defined by the rule

$$\frac{\partial f(x_1,x_2,\cdots)}{\partial x_k} = \lim_{a\to 0}\frac{1}{a}\left[f(x_1,\cdots,x_k + aI,\cdots) - f(x_1,\cdots,x_k,\cdots)\right]. \tag{45·5}$$

¹ Cf. Born-Jordan, E.Q., p. 90.


A canonical system of matrices which satisfies Eqs. (45·4) is said to form
a canonical solution of the equations of motion.

f. The squares of the absolute values of the elements of the Heisenberg
electric-moment matrix

$$\mathbf{D} = \sum_i e_i\,\mathbf{r}_i \tag{45·6}$$

are reflected in the intensities of the spectrum lines when atomic systems 
emit and absorb radiation in a discharge or absorption tube. The 

relation between D and the line intensities will be more fully discussed 
in Sec. 54. 
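The rule of postulate e for differentiating a matrix function with respect to a matrix argument (add a small multiple of the unit matrix to that argument and form the difference quotient) can be tried out numerically. The sketch below uses the sample function f(x, p) = xpx, for which the limit of the quotient is px + xp; the function, the 2 × 2 sample matrices, and the helper names are assumptions introduced only for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def xpx(x, p):
    """Sample matrix function f(x, p) = x p x (the order of factors matters)."""
    return matmul(x, matmul(p, x))


def partial_x(f, x, p, a=1e-6):
    """Difference quotient [f(x + aI, p) - f(x, p)] / a for a small number a."""
    n = len(x)
    shifted = [[x[i][j] + (a if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    f_shifted, f_plain = f(shifted, p), f(x, p)
    return [[(f_shifted[i][j] - f_plain[i][j]) / a for j in range(n)]
            for i in range(n)]


x = [[1.0, 2.0], [2.0, -1.0]]
p = [[0.0, 1.5], [-0.5, 3.0]]

numeric = partial_x(xpx, x, p)
px, xp = matmul(p, x), matmul(x, p)
expected = [[px[i][j] + xp[i][j] for j in range(2)] for i in range(2)]
max_error = max(abs(numeric[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

The residual error is of the order of a, since the quotient for this f equals px + xp + ap exactly.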

45b. Correlation of the Heisenberg and Schrödinger Theories. In
order to relate the two forms of quantum theory we note first of all that
postulate b is satisfied if we take the p's and q's to be Schrödinger matrices
for a set of Hermitian operators whose Hermitian domains include
class D and choose the basic orthonormal sequence φ₁, φ₂, ··· from that
class. Furthermore, the commutation rule d becomes a corollary on the
corresponding operator rule (37·10) if we require that the operators
p_k, q_k shall be canonically conjugate and that the transform of every φ_k
by any one of these operators shall belong to the Hermitian domain of the
others. If this last condition is not satisfied, the products p_k q_k and
q_k p_k may not exist.¹

If the q's are taken to be the Cartesian coordinates and the p's their
momenta, it follows simply from the definition of class D that the first
of the two sufficient conditions for (45·3) is satisfied. This is the usual
choice. While other choices may be valid, they must be examined with
care, since (45·3) does not hold for the Schrödinger matrices of all pairs of
canonically conjugate operators, as was at one time supposed.²

¹ It follows from a theorem introduced on p. 354 that the quadratic summability
of the rows and columns of the product matrices affords an alternative hypothesis
necessary and sufficient to make (45·3) a corollary on (37·10).

² Thus Born and Jordan have shown (cf. E.Q., p. 91) that the matrix equation
(45·3) cannot hold true if the momentum operator has a purely discrete spectrum
and a representation is used in which p is diagonal. A case in point is obtained if
we identify p with a component of the angular momentum, say ℒ_z, and q with the
conjugate azimuthal angle (cf. end of Sec. 39a).

In order to satisfy postulate c with Schrödinger matrices it is necessary
that the phases of the base functions shall be harmonic functions
of t. It is sufficient to choose for the φ's an orthonormal system of
simultaneous solutions of the two Schrödinger equations Hφ = Eφ and
Hφ = −(h/2πi) ∂φ/∂t. In that case

$$\varphi_n = \psi_n\,e^{-\frac{2\pi i}{h}E_n t} \tag{45·7}$$

and

$$q_k(n,m) = \int \varphi_n^{*}\,q_k\,\varphi_m\,d\tau = e^{\frac{2\pi i}{h}(E_n - E_m)t}\int \psi_n^{*}\,q_k\,\psi_m\,d\tau. \tag{45·8}$$

Let us designate these as canonical Schrödinger matrices.¹

We turn next to postulate e and the equations of motion (45·4).
These equations parallel the operator equations (39·26) just as the
definition (45·5) parallels (39·18). Hence we can expect that with
suitable restrictions the matrix form of the equations of motion will
turn out to be a consequence of the operator form if we use canonical
Schrödinger matrices. Actually there is no difficulty in a formal derivation
of the matrix equations of motion on this basis. However, the
matrix equations can also be derived independently of the operator
equations by methods similar to those used in setting up the latter.

The first step in deriving Eqs. (45·4) is to note that from Eq. (44·5)
we are led to expect that the functional relationship between H, the
Cartesian coordinates x_k, and their momenta p_k is paralleled by a formally
identical relationship between the matrices H, x_k, p_k. Thus the operator
equation

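$$H(x_k,p_k) = \sum_{k=1}^{3}\frac{p_k^2}{2\mu} - \frac{Ze^2}{(x_1^2 + x_2^2 + x_3^2)^{1/2}}$$

for the internal energy of a hydrogenic atom leads to the corresponding
matrix equation

$$\mathbf{H} = \sum_{k}\frac{\mathbf{p}_k^2}{2\mu} - Ze^2\left[\mathbf{x}_1^2 + \mathbf{x}_2^2 + \mathbf{x}_3^2\right]^{-1/2}. \tag{45·9}$$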

The validity of this equation, like the validity of (45·3), rests upon the
existence and quadratic summability of the various terms, but we can
introduce the required postulate in good conscience for canonical Schrödinger
matrices and Cartesian coordinates. Furthermore, it follows
from Eqs. (45·5) and (39·18) that the matrix derivatives ∂H/∂p_k and ∂H/∂q_k
are the matrices of the operators ∂H/∂p_k and ∂H/∂q_k, respectively. Finally,
as the basic functions are solutions of the second Schrödinger equation
[cf. Eq. (39·25)] the matrices of the operators dq_k/dt, dp_k/dt are respectively
equal to the time derivatives of the matrices q_k and p_k. Thus Eqs. (45·4)
are established by equating the matrices of the right- and left-hand
members of (39·26).

These considerations suffice to prove that canonical Schrödinger
matrices of the Cartesian coordinates satisfy all the conditions imposed
on the Heisenberg matrices by postulates b, c, d, e above. We have
yet to consider whether the postulates are sufficient to determine the
matrices uniquely or not.

¹ It is necessary to modify the eigenvalue-eigenfunction problem for H in order to
eliminate the continuous spectrum (cf. Sec. 3^') or to include a continuous portion
of each matrix.

45c. Solution of Matrix Equations of Motion for an Ideal Linear
Oscillator. The problem of the ideal linear oscillator was the first to be
solved by the matrix method.¹ In this case the Hamiltonian matrix
has the form

$$H(x,p) = \frac{p^2}{2\mu} + \frac{kx^2}{2}. \tag{45·10}$$

The equations of motion reduce to

$$\dot{x} = \frac{\partial H}{\partial p} = \frac{p}{\mu}, \qquad \dot{p} = -kx. \tag{45·11}$$

Eliminating p and introducing the classical vibration frequency
ν_c = (2π)⁻¹(k/μ)^(1/2) we obtain

$$\ddot{x} = \|(2\pi i\nu_{mn})^2\,x(m,n)\| = -\frac{k}{\mu}\,x = -4\pi^2\nu_c^2\,x.$$

Hence

$$(\nu_{mn}^2 - \nu_c^2)\,x(m,n) = 0. \tag{45·12}$$

This equation and (45·2) show that x(m,n) must vanish unless

$$E_m - E_n = \pm h\nu_c. \tag{45·13}$$

It follows that the energy levels, or diagonal elements of H, consist of one
or more equally spaced series. In order to obtain a solution of our
equations we assume (a) that there is just one such series and (b) that
the energy levels are nondegenerate. Since the ordering of the rows and
columns of the matrices is immaterial so long as it follows a consistent
scheme, we can assume without loss of generality that

$$\nu(n+1,n) = \frac{E_{n+1} - E_n}{h} = \nu_c. \tag{45·14}$$

The nonvanishing elements of x will then be arranged in two lines parallel
and adjacent to the principal diagonal. Thus

$$x = \begin{Vmatrix} 0 & x(1,2) & 0 & \cdots \\ x(1,2)^{*} & 0 & x(2,3) & \cdots \\ 0 & x(2,3)^{*} & 0 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}.$$

By (45·11) the matrix p will have the same form, since

$$p(n,m) = 2\pi i\,\nu_{nm}\,\mu\,x(n,m).$$

¹ M. Born, W. Heisenberg, and P. Jordan, Zeits. f. Physik 35, 557 (1926).



It follows at once that all nondiagonal elements of [x,p] vanish as required
by the commutation rule. The diagonal elements of [x,p] must have the
common value unity. Consequently

$$1 = \frac{8\pi^2\mu}{h}\,\nu_c\left[|x(n,n+1)|^2 - |x(n,n-1)|^2\right]. \tag{45·15}$$

Thus the squares of the absolute values of the matrix elements x(n,n+1)
form an arithmetical progression with the common difference h/8π²μν_c.
These terms are essentially positive so that n must have a minimum
value which we can set equal to zero. For this lowest energy level,
(45·15) reduces to

$$|x(0,1)|^2 = \frac{h}{8\pi^2\mu\nu_c}$$

and we obtain for the general case

$$|x(n,n+1)|^2 = (n+1)\,\frac{h}{8\pi^2\mu\nu_c}, \qquad n = 0, 1, 2, \cdots \tag{45·16}$$

For the matrix element x(n,n+1) itself we have

$$x(n,n+1) = x(n+1,n)^{*} = \left[\frac{(n+1)h}{8\pi^2\mu\nu_c}\right]^{1/2} e^{i\varphi_n}\,e^{-2\pi i\nu_c t}, \tag{45·17}$$

where the φ_n are arbitrary phase constants.

Having evaluated x we can next determine p by (45·11) and H by
(45·10). For the latter we find

$$H(n,m) = E_n\,\delta_{nm} = (n + \tfrac{1}{2})\,h\nu_c\,\delta_{nm}. \tag{45·18}$$

The energies obtained in this way are the same as those worked out by
the Schrödinger method in Sec. 20. The matrix components are also
readily verified by the Schrödinger method. Thus

$$x(n,m) = \int \psi_n^{*}\,x\,\psi_m\,dx.$$

The substitution 

= CnHniOe"^, f 

yields 

x(n,m) = 

The recurrence formula (20·8) reduces the integral to a form which is
readily evaluated with the aid of (M-10). The calculated values then
agree with (45·17).
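The oscillator solution can also be checked by direct matrix multiplication on a finite corner of the infinite matrices. The sketch below is illustrative only and rests on several assumptions: units in which h = 2π, μ = 1, and ν_c = 1/2π (so that hν_c = 1 and k = 1), all phase constants set to zero, the time taken as t = 0, a truncation to six rows and columns, and frequencies ν_nm = (n − m)ν_c taken from (45·2) with the equally spaced levels of (45·14). The variable and function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Illustrative units (an assumption): h = 2*pi, mu = 1, nu_c = 1/(2*pi).
H_PLANCK = 2 * math.pi
MU = 1.0
NU_C = 1 / (2 * math.pi)
K = 4 * math.pi ** 2 * NU_C ** 2 * MU   # force constant k = 4 pi^2 nu_c^2 mu
SIZE = 6                                # truncation; the true matrices are infinite


def oscillator_matrices(size):
    """Truncated x and p at t = 0 with all phase constants set to zero."""
    x = [[0j] * size for _ in range(size)]
    for n in range(size - 1):
        amp = math.sqrt((n + 1) * H_PLANCK / (8 * math.pi ** 2 * MU * NU_C))
        x[n][n + 1] = x[n + 1][n] = complex(amp)
    # p(n,m) = 2 pi i nu_nm mu x(n,m), with nu_nm = (n - m) nu_c.
    p = [[2j * math.pi * (n - m) * NU_C * MU * x[n][m] for m in range(size)]
         for n in range(size)]
    return x, p


x, p = oscillator_matrices(SIZE)
px, xp = matmul(p, x), matmul(x, p)
xx, pp = matmul(x, x), matmul(p, p)

# Bracket [x, p] = (2 pi i / h)(p x - x p).
bracket = [[2j * math.pi / H_PLANCK * (px[i][j] - xp[i][j]) for j in range(SIZE)]
           for i in range(SIZE)]

# Hamiltonian H = p^2 / (2 mu) + k x^2 / 2.
hamiltonian = [[pp[i][j] / (2 * MU) + K * xx[i][j] / 2 for j in range(SIZE)]
               for i in range(SIZE)]

for n in range(SIZE - 1):   # the last row is spoiled by the truncation
    print(n, round(bracket[n][n].real, 10), round(hamiltonian[n][n].real, 10))
```

In the last row and column the truncation spoils the result: the diagonal sum of px − xp necessarily vanishes for finite matrices, which is one way of seeing why matrices satisfying (45·3) must be infinite.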





Born and Jordan¹ have proved that the above solution of the matrix
problem is unique except for the physically meaningless phase constants
φ_n.

45d. Reduction of the Fundamental Problem of Matrix Mechanics
to a Principal-axis Transformation. In the matrix mechanics, as in
the classical mechanics, a solution of the equations of motion reduces the
Hamiltonian to a constant, provided that H(q,p) does not contain the
time explicitly. This is the matrix form of the law of the conservation
of energy. It is proved by the equation


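$$\frac{dH}{dt} = \frac{1}{2}\sum_k\left[\frac{\partial H}{\partial q_k}\frac{dq_k}{dt} + \frac{\partial H}{\partial p_k}\frac{dp_k}{dt}\right] + \frac{1}{2}\sum_k\left[\frac{dq_k}{dt}\frac{\partial H}{\partial q_k} + \frac{dp_k}{dt}\frac{\partial H}{\partial p_k}\right]$$

$$= \frac{1}{2}\sum_k\left[\frac{\partial H}{\partial q_k}\frac{\partial H}{\partial p_k} - \frac{\partial H}{\partial p_k}\frac{\partial H}{\partial q_k}\right] + \frac{1}{2}\sum_k\left[\frac{\partial H}{\partial p_k}\frac{\partial H}{\partial q_k} - \frac{\partial H}{\partial q_k}\frac{\partial H}{\partial p_k}\right] = 0. \tag{45·19}$$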


Equations (45·1) and (45·2) demand that the diagonal matrix

$$E = \|E_n\,\delta_{nm}\|$$

shall satisfy the relations

$$\dot{q}_k = [q_k,E]; \qquad \dot{p}_k = [p_k,E]. \tag{45·20}$$

Furthermore (45·20) implies that if f is a function of the q's and p's

$$\frac{df}{dt} = [f,E]. \tag{45·21}$$

The proof will readily be supplied by the reader following the argument
used in establishing Eqs. (39·22) and (39·23). We conclude from (45·19)
and (45·21) that if H(q,p) is built up from a solution of the equations of
motion, it must commute with the diagonal matrix E. We can assume
without loss of generality that E is an ordered diagonal matrix. Then,
by a rule given on p. 351, H must be a step matrix in which each step
is associated with a single energy E_n.

Let us now apply a unitary transformation

$$Q_k = U^{-1}q_kU, \qquad P_k = U^{-1}p_kU. \tag{45·22}$$

Such a transformation preserves the functional relationships between
matrices [cf. Eq. (44·21)] and hence transforms one canonical solution
of the equations of motion into another. We choose U as a step matrix V
similar to H. Then each step will be transformed according to the
rule

$$H'^{(n)} = (V^{(n)})^{-1}\,H^{(n)}\,V^{(n)}. \tag{45·23}$$

It is always possible to convert a finite symmetric square matrix to
diagonal form (cf. p. 361) by means of a properly chosen unitary transformation.
Hence we can always choose V so that it makes H diagonal,
i.e., we can always find a principal-axis transformation which converts
H into E. Thus if any canonical solution of the equations of motion
exists there will be one which makes H diagonal.

¹ Born and Jordan, E.Q., p. 23.

Conversely, if we can find a canonical set of matrices which make H
diagonal we have a solution of the equations of motion. By (39·22) and
(39·23)

$$\frac{\partial H}{\partial p_k} = [q_k,H]; \qquad \frac{\partial H}{\partial q_k} = -[p_k,H]. \tag{45·24}$$

If H(q_k,p_k) is diagonal and if the q's and p's are given phase factors in
accordance with (45·1) and (45·2)

$$\frac{\partial H}{\partial p_k}(n,m) = \frac{2\pi i}{h}\,(E_n - E_m)\,q_k(n,m) = \dot{q}_k(n,m).$$

Similarly the second set of Hamiltonian equations is satisfied, as was to be
proved.

This suggests that one way to solve the Heisenberg equations independent
of the Schrödinger method would be to start with an arbitrary
special canonical system of matrices, say q_k^(0), p_k^(0), form the corresponding
Hamiltonian H^(0) = H(q^(0),p^(0)) and seek a canonical transformation
which will diagonalize H^(0). Such a transformation, if it can be found,
will preserve the canonical character of the q,p matrix system and give a
solution of the equations of motion. It cannot always be found without
making use of the generalized matrices whose rows and columns are not
all discrete. However, this procedure does give the starting point of a
systematic method of attacking problems in matrix mechanics which
cannot be solved by such a simple direct approach as that used in dealing
with the ideal linear oscillator. This is the basis of the perturbation
theory of the matrix mechanics.

The question now arises whether the solution of the equations of motion obtained
in the above manner will be unique or will depend on the choice of the initial canonical
system $q^{(0)}, p^{(0)}$. The point has been investigated by Born and Jordan,¹ who found
it possible to insure uniqueness by imposing on the $p^{(0)}$'s and $q^{(0)}$'s the condition that
$[p_k^{(0)}]^2 + [q_k^{(0)}]^2$ shall be transformable to diagonal form for every value of k by means
of a properly chosen canonical transformation. This condition is equivalent to
postulating that all matrices shall be derivable from the matrices which solve the problem
of a set of harmonic oscillators, since $p^2 + q^2$ is the Hamiltonian matrix of a harmonic
oscillator of properly chosen mass and frequency. From the Schrödinger standpoint
we should expect to start with the matrices based on a proper complete orthonormal
set of base functions and just that is actually effected by this criterion of Born and
Jordan.


¹ Born and Jordan, p. 128.





46. THE BOHR CORRESPONDENCE PRINCIPLE AND ITS RELATIONSHIP
TO MATRIX THEORY

46a. The Bohr Postulates. Historically the matrix mechanics is an
outgrowth of the attempt to refine the correspondence principle of the
quasi-classical Bohr theory. As this correspondence principle continues
to be a useful tool for the heuristic examination of quantum-mechanical
problems we pause here to sketch the Bohr theory and its relation to
matrix mechanics.

Bohr’s work began with an attempt to reconcile the Rutherford 
nuclear atomic model with the empirical spectrum of hydrogen. His 
fundamental assumptions were, briefly, as follows: 

a. An atomic or molecular system can exist only in certain discrete
nonradiating "stationary states" which define corresponding "allowed"
energy levels.

b. Transitions, or “jumps,” from one energy level to another take 
place and are accompanied by the emission or absorption of monochroma- 
tic radiation of frequency v according to the rule 

$$h\nu = E' - E'', \tag{46.1}$$

where $E'$ and $E''$ are respectively the upper and lower energy levels in
question.

These were the primary postulates and carry over with some
reinterpretations to the quantum mechanics of systems with purely discrete
energy spectra. The third hypothesis, however, was of an essentially
tentative and provisional character and was introduced not so much
because Bohr believed it to be true as because at the time no useful
alternative hypothesis suggested itself.

c. When the system is in one of its stationary states the motion of
the electrons and other particles which make up an atomic system takes
place in accordance with the laws of classical mechanics, i.e., classical
electrodynamics with radiation forces omitted.

A fourth basic element in the theory, which took final form some time 
after the appearance of Bohr’s first papers, was the quantization rule. 
This rule, or quantum condition, first formulated by Sommerfeld, is 
applicable only to multiply-periodic or “conditionally” periodic systems. 
These are systems whose motions are such that the variation in time of 
each coordinate and momentum component can be represented by a 
multiple Fourier’s series with one or more independent basic frequencies 
$\omega_1, \omega_2, \ldots, \omega_r$, where r is not greater than the number of degrees of
freedom. Thus if $x_k$ is one of the coordinates, and if we choose the
complex form of Fourier's series, the expansion

$$x_k(t) = \sum_{\tau_1, \ldots, \tau_r} x_k(\tau_1, \ldots, \tau_r) \exp[2\pi i(\tau_1\omega_1 + \cdots + \tau_r\omega_r)t] \tag{46.2}$$

must be possible when the system is multiply periodic. In this expansion
the coefficients $x_k(\tau_1, \ldots, \tau_r)$ depend on the constants of integration
which fix the orbit under consideration.

In the case of such systems it is possible to introduce a set of so-called
"action variables" $J_1, J_2, \ldots, J_r$, one for each of the independent
basic frequencies, and having the following properties. They are
constants of integration which fix the energy E and fulfill the conditions¹

$$\frac{\partial E}{\partial J_i} = \omega_i, \qquad i = 1, 2, \ldots, r \tag{46.3}$$

$$\sum_{i=1}^{r} J_i \omega_i = 2\bar{T}, \tag{46.4}$$

where $\bar{T}$ is the average value of the kinetic energy of the system over a
long period of time. Usually it is possible to choose a set of generalized
coordinates $q_1, q_2, \ldots, q_f$ such that each is associated with a definite
corresponding fundamental frequency $\omega_k$. The integral $\oint p_k\,dq_k$ extended
over a complete cycle of the variable $q_k$ is called the Sommerfeld phase
integral (cf. Sec. 21f) for the coordinate $q_k$. Where these exist, one can
identify each action variable $J_k$ with the sum of the phase integrals for
all coordinates with the same frequency $\omega_k$. We can now state the
fourth postulate in the following form.

d. The stationary states of a multiply-periodic system comprise
those states for which the action variables are integral multiples of
Planck's constant h. Thus for these allowed states

$$J_k = \sum_i \oint p_i\,dq_i = n_k h, \qquad n_k = 0, 1, 2, \ldots \tag{46.5}$$

Later on it was discovered that the empirical facts could be fitted
more accurately in some cases if one (or more) of the $J_k$ was supposed to
be an odd multiple of h/2. This modification in the theory is evidently
unessential in view of the arbitrary nature of d.
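As an illustration of the quantum condition d, the phase integral of a linear harmonic oscillator can be evaluated numerically. The sketch below assumes a classical orbit q(t) = A cos(2πνt) of a particle of mass m; the function name `phase_integral` and the numerical values of the mass and frequency are hypothetical choices made only for this example. For this system the phase integral equals E/ν, so the condition J = nh selects the energies E = nhν.

```python
import numpy as np

H_PLANCK = 6.62607015e-34   # Planck's constant in J s

def phase_integral(energy, mass, nu, steps=20000):
    """Evaluate J = closed integral of p dq over one period of the oscillator."""
    omega = 2.0 * np.pi * nu
    amplitude = np.sqrt(2.0 * energy / (mass * omega**2))
    t = np.linspace(0.0, 1.0 / nu, steps + 1)
    q_dot = -amplitude * omega * np.sin(omega * t)   # dq/dt along the orbit
    p = mass * q_dot                                  # momentum along the orbit
    integrand = p * q_dot                             # p dq = p (dq/dt) dt
    dt = t[1] - t[0]
    # Trapezoidal rule written out by hand.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dt)

mass = 1.0e-26   # kg, illustrative value only
nu = 1.0e13      # Hz, illustrative value only

# Energies E = n h nu should give action variables J = n h.
quantum_numbers = [phase_integral(n * H_PLANCK * nu, mass, nu) / H_PLANCK
                   for n in range(4)]
print(quantum_numbers)
```

The printed ratios J/h come out as whole numbers to within rounding error, which is all that postulate d asserts for this one-dimensional case.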

¹ Cf. J. H. Van Vleck, Quantum Principles and Line Spectra, p. 18, Washington,
D. C., 1926.

46b. The Bohr Correspondence Principle and the Heisenberg Matrix
Theory. An important feature of the Bohr theory is the correlation
which it permits between the classical basic frequencies $\omega_1, \omega_2, \ldots, \omega_r$
and the radiated frequencies (quantum frequencies) permitted by the
rule (46.1). In order to explain this correlation we make use of an
r-dimensional space in which each of the action variables is laid out at
right angles to the others. In such a space each allowed energy level
is represented by a lattice point defined by (46.5). Using primes to
denote the quantum numbers $n_k$ for the upper energy level and double
primes for the lower energy level we define a "quantum transition"
by the quantum numbers $n_1', \ldots, n_r'$ and by the differences

$$n_1' - n_1'' = \tau_1; \quad n_2' - n_2'' = \tau_2; \quad \ldots; \quad n_r' - n_r'' = \tau_r.$$

The frequency of the radiation emitted in such a jump, viz.,

$$\nu = \frac{E(n_1, n_2, \ldots) - E(n_1 - \tau_1, n_2 - \tau_2, \ldots)}{h},$$

is readily identified with the average value of the combination frequency

$$\omega(\tau) = \tau_1\omega_1 + \cdots + \tau_r\omega_r$$

evaluated along a line joining those points in J space which represent
the initial and final energy levels.¹ Thus every type of quantum
transition is correlated with a definite pair of terms in the Fourier expansion
(46.2)² and it becomes evident that if the frequencies become independent
of the J values, as in the region of high quantum numbers, the quantum
frequencies and the classical frequencies will be the same.
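The two statements of the preceding paragraph can be checked on a one-dimensional example. The sketch below assumes hydrogen-like levels E(n) = -hR/n² with J = nh, so that the classical frequency dE/dJ is 2R/n³; the function names and the sample quantum numbers are hypothetical, and the value of R is only approximate.

```python
R = 3.2898419603e15   # Rydberg frequency in Hz (approximate value)

def quantum_frequency(n, tau):
    """Frequency (E(n) - E(n - tau)) / h for levels E(n) = -h R / n**2."""
    return R * (1.0 / (n - tau) ** 2 - 1.0 / n ** 2)

def classical_frequency(x):
    """Classical orbital frequency dE/dJ = 2 R / x**3 at action J = x h."""
    return 2.0 * R / x ** 3

def averaged_classical_frequency(n, tau, samples=20000):
    """tau times the mean of the classical frequency between J = (n - tau) h and J = n h."""
    width = tau / samples
    total = 0.0
    for i in range(samples):
        x = (n - tau) + (i + 0.5) * width   # midpoint rule along the line in J space
        total += classical_frequency(x)
    return tau * total / samples

# The quantum frequency equals the averaged combination frequency.
print(quantum_frequency(5, 2), averaged_classical_frequency(5, 2))

# For high quantum numbers the quantum and classical frequencies agree.
for n in (5, 50, 500, 5000):
    print(n, quantum_frequency(n, 1) / classical_frequency(n))
```

The first line of output shows two nearly equal numbers, and the printed ratios fall toward unity as n grows, which is the sense in which the quantum and classical frequencies coincide at high quantum numbers.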

In order to fit his theory with the classical theory of the radiation of
macroscopic systems Bohr superimposed on this correlation principle the
assumption that in the region of high quantum numbers the intensities of the
spectral lines become asymptotically equal to the intensity of the radiation
computed classically from the corresponding Fourier amplitude in (46.2),
whether for the upper energy level or the lower one. Where the classical
intensity for a given harmonic component is zero independent of the state for
which it is computed (i.e., independent of the J's), the intensity of the
radiation emitted in the corresponding type of quantum jump is zero.

This is Bohr's correspondence principle and the basis for the
theoretical treatment of selection rules governing "allowed" and "forbidden"
transitions in the Bohr theory.

The classical rate of emission for a system of charges with a
multiply-periodic motion can be resolved into a sum of terms, one for each
frequency. The expression for the electric vector in the emitted radiation
field can also be broken up into a sum of terms giving the contributions
of the electric dipole moment, the magnetic dipole moment, the electric
quadrupole moment, etc.³ Usually the dipole-moment term is so large
that it swamps the others out. In that case the rate of emission is
given by the familiar formula for a harmonic oscillator of electric
moment $\vec{d}(t)$, viz.,

$$\text{Power radiated} = \frac{2}{3c^3}\,|\ddot{\vec{d}}|^2,$$

provided that we identify $\vec{d}$ with the sum of the terms of appropriate
frequency in the Fourier expansion of the electric moment $\vec{D} = \sum_k e_k \vec{r}_k$.

¹ For a more detailed treatment cf. J. H. Van Vleck, loc. cit., pp. 23-28.

² Two terms in (46.2), obtainable one from the other by reversing the signs of all the
τ's, are associated with the same absolute frequency. These terms are conjugate
complex quantities and can be united to form a single real harmonic term if desired.

³ Cf. J. Frenkel, Elektrodynamik, Vol. I, p. 158.

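Let us write the expansion in a form similar to (46.2), viz.,

$$\vec{D}(t) = \sum_{\tau_1, \ldots, \tau_r = -\infty}^{+\infty} \vec{D}(\tau_1, \ldots, \tau_r) \exp[2\pi i(\tau_1\omega_1 + \cdots + \tau_r\omega_r)t]. \tag{46.6}$$

The sum of the terms of frequency $|\tau_1\omega_1 + \cdots + \tau_r\omega_r|$ is of the form

$$2|\vec{D}(\tau_1, \ldots, \tau_r)| \cos[2\pi(\tau_1\omega_1 + \cdots + \tau_r\omega_r)t + \theta],$$

where θ is a phase constant previously buried in $\vec{D}(\tau_1, \ldots, \tau_r)$. Using
this expression for $\vec{d}(t)$ and computing the time average of the power
radiated per atom, we obtain

$$I(\tau_1, \ldots, \tau_r) = \frac{64\pi^4}{3c^3}\,|\tau_1\omega_1 + \cdots + \tau_r\omega_r|^4\,|\vec{D}(\tau_1, \ldots, \tau_r)|^2. \tag{46.7}$$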

Approximate estimates of the intensities of spectrum lines were obtainable
on the basis of the Bohr theory by means of (46.7), replacing

$$|\tau_1\omega_1 + \cdots + \tau_r\omega_r|$$

by the actual quantum frequency ν of the emitted radiation and

$$|\vec{D}(\tau_1, \ldots, \tau_r)|^2$$

by some sort of average of its values for the initial and final states under
consideration.
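The numerical factor in the classical intensity formula (46.7) can be checked by averaging the dipole radiation formula over one period. The sketch below assumes Gaussian units, the standard dipole expression (2/3c³)|d̈|², and a single harmonic term d(t) = 2|D| cos(2πνt + θ); the function names and the sample values of |D|, ν, and θ are hypothetical.

```python
import numpy as np

C_LIGHT = 2.99792458e10   # speed of light in cm/s (Gaussian units)

def mean_radiated_power(d_abs, nu, theta=0.3, steps=4096):
    """Time average of (2 / 3c^3) |d''(t)|^2 for d(t) = 2 |D| cos(2 pi nu t + theta)."""
    # Sample one full period uniformly, excluding the repeated end point.
    t = np.arange(steps) / (steps * nu)
    d_ddot = -(2.0 * np.pi * nu) ** 2 * 2.0 * d_abs * np.cos(2.0 * np.pi * nu * t + theta)
    return float(np.mean(2.0 * d_ddot ** 2 / (3.0 * C_LIGHT ** 3)))

def intensity_formula(d_abs, nu):
    """Closed form (64 pi^4 / 3 c^3) nu^4 |D|^2 of Eq. (46.7) for one harmonic term."""
    return 64.0 * np.pi ** 4 * nu ** 4 * d_abs ** 2 / (3.0 * C_LIGHT ** 3)

d_abs = 1.0e-18   # esu cm, illustrative magnitude only
nu = 5.0e14       # Hz, illustrative value only
averaged = mean_radiated_power(d_abs, nu)
closed_form = intensity_formula(d_abs, nu)
print(averaged, closed_form)
```

The two printed values agree because the time average of cos² over a full period is 1/2, which turns the factor 2 · 4 · (2π)⁴ / 3c³ into 64π⁴ / 3c³.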

The Heisenberg-Born-Jordan matrix theory was a synthesis based
on the correspondence principle, the Rydberg combination principle
as expressed by (46.1), and the philosophical "hunch" that the material
to be used in constructing a proper theory should be more nearly
experimentally observable than the hypothetical orbits of the Bohr theory.
In the position and intensity of each spectrum line the correspondence 
principle saw a reflection of the frequency and amplitude of a correspond- 
ing harmonic component of the motion. Heisenberg therefore sought to 
construct a theory which should relate these harmonic components 
directly to the Hamiltonian function of the atom. The identification 
of these components requires a double system of indices like that used 
for matrix elements and the attempt to form algebraic combinations 
in order to relate the motion to the energy without violating the Rydberg 
rule led to matrix multiplication and addition. We need not follow the
reasoning in detail. Suffice it to say here that from the nature of its
premises the H. B. J. theory inevitably replaced (46.7) by the formula¹

$$I(n',n'') = \frac{64\pi^4\nu^4}{3c^3}\,|\vec{D}(n',n'')|^2, \tag{46.8}$$

in which $n'$ and $n''$ denote the complete sets of quantum numbers
characteristic of the initial and final states, respectively, and $\vec{D}(n',n'')$ is the
corresponding element of the Heisenberg matrix $\vec{D}$. The formula
(46.8) gives the rate of emission of energy per atom when the external
radiation field is negligible. The rates of absorption and forced emission
in a natural radiation field can be worked out either by the aid of
Einstein's quasi-thermodynamic² theory of the relation between emission
and absorption transition probabilities (cf. Sec. 54), or from Van Vleck's
formulation of the correspondence principle for absorption.³

Thus the Heisenberg theory was, so to speak, born with formulas 
for the rates of emission and absorption tied around its neck. This 
scheme for adapting classical intensity formulas to quantum mechanics 
is entirely successful in outcome but smacks too much of analogy to be 
entirely satisfying. In Sec. 54 we shall consider the justification of 
(46.8) from a more advanced point of view.

A serious difficulty with the original Bohr theory lay in the fact that
it was applicable only to multiply-periodic motions, whereas a strict
application of classical mechanics to the accepted Rutherford atomic
model would lead in practically all cases to essentially aperiodic motions.
Nevertheless Bohr found it possible to correlate the various empirical
quantum jumps accompanying the emission of radiation with
corresponding terms of a Fourier's series, using a plausible multiply-periodic
idealization of the approved model. The formula (46.8) can be established on
the basis of quantum mechanics independently of the correspondence 
principle and is not subject to the above limitation to a special type of 
motion. On the other hand, it is still sometimes convenient to use a 
multiply-periodic model as a basis for the approximate estimate of 
intensities by the correspondence-principle method. 

¹ In the usual case where the initial and final energy levels are degenerate, one
uses (46.8) to compute the intensity of emission from each substate of the upper
energy level to each substate of the lower level, adding these contributions to get the
total intensity of the line.

² A. Einstein, Phys. Zeits. 18, 121 (1917).

³ J. H. Van Vleck, Phys. Rev. 24, 330, 347 (1924).

It follows from the relation established in Sec. 11 between the
Schrödinger equation and the corresponding classical Hamilton-Jacobi equation
that if the variables can be separated in the former by the introduction
of proper coordinates they can be separated in the latter by using the
same coordinates. If the variables are separable in the Hamilton-Jacobi
equation and if each coordinate has a definite frequency (cf.
M. Born, Vorlesungen über Atommechanik, Berlin, 1925), the classical
motion is multiply-periodic. Other types of multiply-periodic motions
occur rarely, if at all. Thus it turns out that in practice the
correspondence principle is usually applicable only if the variables in the
Schrödinger equation are separable.



CHAPTER XI 


THEORY OF PERTURBATIONS WHICH DO NOT INVOLVE THE TIME

47. THE PERTURBATION THEORY FOR NONDEGENERATE PROBLEMS

47a. First-order Perturbations. In quantum mechanics, as in the
Bohr theory, perturbation methods are of fundamental importance due
to the fact that so few problems can be rigorously solved by direct
attack. The essential feature of these methods is that one starts with
an approximate solution of the problem in hand and proceeds to compute
by "hammer and tongs" a series of corrections designed to improve the
approximation. The success of such a computation depends partly
on the patience and energy of the computer and partly on his ability
to find a happy starting point. The successive approximations may not
converge on an exact solution of the problem, but usually the early
corrections do yield an appreciable improvement on the initial wave
functions.

The conventional perturbation theory of wave mechanics¹ is concerned
with the determination of the discrete eigenvalues and eigenfunctions
of a Hamiltonian operator $H(q, \partial/\partial q)$. It attempts to approximate
these values and functions with the aid of the rigorous solutions of the
eigenvalue-eigenfunction problem of a simplified, but related,
Hamiltonian function $H_0(q, \partial/\partial q)$ involving the same coordinates. We
designate the problems based on the two operators $H_0$ and H as the
unperturbed and the perturbed problems, respectively. In order to pass
from the known discrete solutions, say $E_k^{(0)}, \psi_k^{(0)}$, of the unperturbed
problem to corresponding solutions of the perturbed problem, it is useful
to construct a one-parameter continuous series of problems that bridge
the gap between the two. To do this we introduce the symbol $H_1$ for
the difference $H - H_0$ and define the operator $\mathcal{H}$ by the equation

$$\mathcal{H} = H_0 + \lambda H_1 = H_0 + \lambda(H - H_0). \tag{47.1}$$

Here λ is a parameter which may take on any value between zero and
unity. Since $\mathcal{H}$ reduces to $H_0$ when λ is zero and to H when λ is unity,
we call $\mathcal{H}$ the interpolation Hamiltonian. $H_1$ is called the perturbing
Hamiltonian. The method of successive approximations which we here
apply to $\mathcal{H}$ can also be applied to operators expressible as complete
power series in a parameter λ.

¹ Cf. E. Schrödinger, Ann. d. Physik (4) 80, 437 (1926).


We assume that the solutions of the interpolation problem

$$\mathcal{H}\psi = E\psi \tag{47.2}$$

are analytic functions of λ in the interval $0 \le \lambda \le 1$ and therefore reduce
to solutions of the unperturbed problem when λ is zero and become
solutions of the perturbed problem when λ is unity.

Exceptions to this rule can occur when the perturbing Hamiltonian $H_1$
introduces new singularities into the problem or modifies in a fundamental
way the conditions at preexisting singularities. It can happen, for
example, that $H_0$ has a discrete spectrum while H and $\mathcal{H}$ have purely
continuous spectra. In such cases solutions of the interpolation problem
will change discontinuously when we pass from any positive value of λ
to the value zero. The possibility of such discontinuities requires special
examination for each individual case, but in developing the general theory
we shall assume that solutions of Eq. (47.2) form a truly continuous
connection between the eigenfunctions, and eigenvalues, of the perturbed
and unperturbed problems.

In some cases the interpolated series of problems is wholly artificial,
but in others, e.g., when $H_1$ represents the perturbing effect of an external
electric or magnetic field, all values of λ give results of experimental
significance and it is unnecessary to distinguish between the interpolation
Hamiltonian and the perturbed Hamiltonian. Of course the eventual
artificiality of the $\mathcal{H}$ problem has nothing to do with the mathematical
procedure, or with its validity.

Since degeneracy introduces considerable complications into
perturbation theory, we begin with a study of cases in which $H_0$ and $\mathcal{H}$ have
nondegenerate eigenvalues.¹ Let there be given a complete normalized
set of discrete and continuous-spectrum eigenfunctions $\psi_n^{(0)}(q)$, $\psi^{(0)}(E,q)$ of
the unperturbed problem

$$H_0\psi^{(0)} = E^{(0)}\psi^{(0)}. \tag{47.3}$$

Let $\psi_k(\lambda,q)$ denote the kth normalized discrete eigenfunction of the
interpolation problem (47.2). If the solutions of the two problems are
suitably numbered, our continuity hypothesis requires that

$$\lim_{\lambda \to 0} E_k(\lambda) = E_k^{(0)}$$

and that $\psi_k(0,q) = \lim_{\lambda \to 0} \psi_k(\lambda,q)$ shall be an eigenfunction of (47.3). Then
$\psi_k(0,q) = e^{i\alpha}\psi_k^{(0)}(q)$, where α is an arbitrary phase constant which we can
set equal to zero without loss of generality.

¹ Cases of near degeneracy where the unperturbed eigenvalues occur in closely
spaced groups are best treated by a suitable modification of the variation method
(Sec. 51), or they can be reduced to degenerate form by pulling out of $H_0$ the term
responsible for the splitting of the groups and adding this term to the perturbing
Hamiltonian.

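Expanding $E_k$ and $\psi_k$ into power series in λ, we have

$$E_k = E_k^{(0)} + \lambda\left(\frac{dE_k}{d\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{d^2E_k}{d\lambda^2}\right)_{\lambda=0} + \cdots, \qquad \psi_k = \psi_k^{(0)} + \lambda\left(\frac{\partial\psi_k}{\partial\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{\partial^2\psi_k}{\partial\lambda^2}\right)_{\lambda=0} + \cdots. \tag{47.4}$$

Introducing the abbreviations

$$E_k^{(n)} = \frac{1}{n!}\left(\frac{d^nE_k}{d\lambda^n}\right)_{\lambda=0}, \qquad \psi_k^{(n)} = \frac{1}{n!}\left(\frac{\partial^n\psi_k}{\partial\lambda^n}\right)_{\lambda=0}, \tag{47.5}$$

we reduce these series to the form

$$E_k = \sum_{n=0}^{\infty} \lambda^n E_k^{(n)}, \qquad \psi_k = \sum_{n=0}^{\infty} \lambda^n \psi_k^{(n)}. \tag{47.6}$$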

Our task is now to compute enough terms of these two series to obtain
satisfactory approximations to the perturbed $E_k$ and $\psi_k$. Whether or not
the series actually converge for λ = 1 to give a rigorous solution of the
perturbed problem is a question we shall not attempt to discuss. If the
singular-point characteristics of $H_0$ and H are the same, there can be no
reasonable doubt that those series will converge¹ for sufficiently small
values of λ, and if the initial approximation is good, we may hope not
only for convergence but for rapid convergence when λ is unity.
Unfortunately it is usually impracticable to compute more than two or three
terms of these series so that the usefulness of the method is limited to
cases in which the zero-order approximations $E_k^{(0)}$, $\psi_k^{(0)}$ are quite good.

¹ A. H. Wilson makes the statement that, at least in cases which involve only
discrete spectra, the convergence is that of an exponential series if there is any
convergence at all. [Cf. A. H. Wilson, Proc. Roy. Soc. A, 124, 186 (1929).]

Let us now develop a scheme for computing the first-order corrections
$E_k^{(1)}$ and $\psi_k^{(1)}$ to the energy and wave function, respectively. Using the
notation of Eq. (47.5), differentiating (47.2) with respect to λ, and giving
λ the value zero in the resulting equation, we obtain

$$(H_0 - E_k^{(0)})\psi_k^{(1)} = (E_k^{(1)} - H_1)\psi_k^{(0)}. \tag{47.7}$$

This equation can be used for the determination of both $E_k^{(1)}$ and $\psi_k^{(1)}$.
It is not difficult to show that the left-hand member of this equation is
orthogonal to $\psi_k^{(0)}$. For, since $H_0$ is Hermitian with respect to functions
of class D, it follows from (47.3) that

$$\int \psi_k^{(0)*}(H_0 - E_k^{(0)})\psi_k^{(1)}\,d\tau = \int \psi_k^{(1)}(H_0^* - E_k^{(0)})\psi_k^{(0)*}\,d\tau = 0. \tag{47.8}$$

Hence the right-hand member of (47.7) is also orthogonal to $\psi_k^{(0)}$ and the
value of $E_k^{(1)}$ is fixed by

$$E_k^{(1)} = \int \psi_k^{(0)*} H_1 \psi_k^{(0)}\,d\tau. \tag{47.9}$$

In words Eq. (47.9) states that the first-order energy correction is equal
to the mean value of the perturbing energy operator averaged over the
unperturbed, or zero-order, wave function [cf. Eq. (35.4)]. This is the
quantum-mechanical equivalent of a familiar theorem of the perturbation theory
of classical mechanics.¹
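The content of Eq. (47.9) can be illustrated with finite matrices in place of differential operators. The sketch below assumes a four-level model with a diagonal unperturbed Hamiltonian and a small real symmetric perturbation; the matrices `H0` and `H1`, the helper `levels`, and the step size are hypothetical choices made for this example. It compares the slope of each interpolation energy at λ = 0 with the diagonal element of the perturbing matrix.

```python
import numpy as np

# Nondegenerate unperturbed levels (diagonal H0) and a symmetric perturbation H1.
H0 = np.diag([0.0, 1.0, 2.5, 4.5])
H1 = np.array([[0.3, 0.1, 0.0, 0.2],
               [0.1, -0.2, 0.4, 0.0],
               [0.0, 0.4, 0.1, 0.3],
               [0.2, 0.0, 0.3, -0.4]])

def levels(lam):
    """Eigenvalues of the interpolation Hamiltonian H0 + lam * H1, in ascending order."""
    return np.linalg.eigvalsh(H0 + lam * H1)

# Central-difference estimate of dE_k/d(lambda) at lambda = 0.
lam = 1e-4
slope = (levels(lam) - levels(-lam)) / (2.0 * lam)

# Matrix form of Eq. (47.9): the mean value of H1 in the k-th unperturbed state.
first_order = np.diag(H1)
print(slope)
print(first_order)
```

Because the unperturbed eigenfunctions are the unit vectors in this model, the mean value of the perturbation reduces to the diagonal element H1(k,k), and the two printed rows agree to the accuracy of the finite difference.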

We may now regard Eq. (47.7) as an inhomogeneous equation for the
partial determination of $\psi_k^{(1)}$. The unknown function is not fully
determined by this equation, for we can add to any solution an arbitrary
multiple of $\psi_k^{(0)}$ and get a new solution as good as the first. This
indeterminateness in $\psi_k^{(1)}$ is partially removed when we take into account the
normalization requirement for $\psi_k$, which yields

$$\left[\frac{\partial}{\partial\lambda}\int \psi_k\psi_k^*\,d\tau\right]_{\lambda=0} = \int \psi_k^{(1)}\psi_k^{(0)*}\,d\tau + \int \psi_k^{(0)}\psi_k^{(1)*}\,d\tau = 0.$$

Hence the real part of the scalar product of $\psi_k^{(0)}$ and $\psi_k^{(1)}$ is zero. To fix
$\psi_k^{(1)}$ uniquely we arbitrarily agree that the imaginary part of this scalar
product shall vanish also. Thus

$$\int \psi_k^{(0)*}\psi_k^{(1)}\,d\tau = 0. \tag{47.10}$$

In order to solve (47.7) we expand the unknown function $\psi_k^{(1)}$ into
a series of eigenfunctions of the unperturbed problem. In view of
Eq. (47.10) there will be no term in $\psi_k^{(0)}$. The expansion takes the form

$$\psi_k^{(1)} = \sum_{n \ne k} \psi_n^{(0)} U^{(1)}(n,k) + \int \psi^{(0)}(E)\,U^{(1)}(E,k)\,dE, \tag{47.11}$$

where the coefficients $U^{(1)}(n,k)$, $U^{(1)}(E,k)$ are unknowns to be
determined. The scheme is now to express each side of Eq. (47.7) as a linear
combination of the known unperturbed eigenfunctions and to equate
coefficients of corresponding terms. For this purpose we need the
expansion of $H_1\psi_k^{(0)}$, which we write in the form

$$H_1\psi_k^{(0)} = \sum_n \psi_n^{(0)} H_1(n,k) + \int \psi^{(0)}(E)\,H_1(E,k)\,dE. \tag{47.12}$$

¹ Cf., for instance, J. H. Van Vleck, Quantum Principles and Line Spectra, p. 203,
Washington, 1926; or M. Born, Vorlesungen über Atommechanik, p. 287, Berlin, 1925.


Assuming the usual normalization for the discrete functions $\psi_n^{(0)}$, the
matrix components $H_1(n,k)$ are fixed in accordance with (44.2) by the
explicit formula

$$H_1(n,k) = \int \psi_n^{(0)*} H_1 \psi_k^{(0)}\,d\tau. \tag{47.13}$$

Similarly if $H_1\psi_k^{(0)}$ is absolutely integrable [cf. Eqs. (36.13) and (36.14)
and remark following latter equation],

$$H_1(E,k) = \int \psi^{(0)*}(E)\,H_1 \psi_k^{(0)}\,d\tau. \tag{47.14}$$

Inserting the expansions (47.11) and (47.12) into Eq. (47.7) and
equating coefficients of like terms on the two sides,¹ we obtain

$$U^{(1)}(n,k) = \frac{H_1(n,k)}{E_k^{(0)} - E_n^{(0)}}, \qquad n \ne k \tag{47.15}$$

$$U^{(1)}(E,k) = \frac{H_1(E,k)}{E_k^{(0)} - E}. \tag{47.16}$$

These formulas combined with Eq. (47.11) complete the determination
of the first-order correction to the wave functions.
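The first-order coefficients of Eq. (47.15) can be tried out on a purely discrete model. The sketch below assumes a four-level matrix problem with no continuous spectrum, so that only the sum in Eq. (47.11) survives; the matrices, the chosen level k, and the value of λ are hypothetical. It builds the first-order wave function and compares it with the exact eigenvector of the interpolation Hamiltonian.

```python
import numpy as np

E0 = np.array([0.0, 1.0, 2.5, 4.5])         # unperturbed energies
H0 = np.diag(E0)
H1 = np.array([[0.3, 0.1, 0.0, 0.2],
               [0.1, -0.2, 0.4, 0.0],
               [0.0, 0.4, 0.1, 0.3],
               [0.2, 0.0, 0.3, -0.4]])       # real symmetric perturbation

k = 1                                        # level whose eigenfunction is corrected
U1 = np.zeros(4)
for n in range(4):
    if n != k:
        # Eq. (47.15): U1(n,k) = H1(n,k) / (E_k^(0) - E_n^(0)).
        U1[n] = H1[n, k] / (E0[k] - E0[n])

lam = 1e-3
# Zero-order function is the k-th unit vector; add lambda times the correction.
approx = np.eye(4)[:, k] + lam * U1

# Exact eigenvector of H0 + lam * H1, with its sign fixed to match the unit vector.
_, vecs = np.linalg.eigh(H0 + lam * H1)
exact = vecs[:, k] * np.sign(vecs[k, k])

error = np.max(np.abs(exact - approx))
print(error, lam * np.max(np.abs(U1)))
```

The first printed number, the residual error, is of order λ² and therefore much smaller than the second, which is the size of the first-order correction itself.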

47b. Second-order Perturbations. In order to get the second-order
corrections to $E_k$ and $\psi_k$ we differentiate Eq. (47.2) twice with respect to
λ and then set λ equal to zero. Using the notation of Eqs. (47.5) the
result takes the form

$$(H_0 - E_k^{(0)})\psi_k^{(2)} = E_k^{(2)}\psi_k^{(0)} + (E_k^{(1)} - H_1)\psi_k^{(1)}. \tag{47.17}$$

As in the case of Eq. (47.7) the left-hand member is orthogonal to $\psi_k^{(0)}$ and
we have

$$E_k^{(2)} = \int \psi_k^{(0)*}(H_1 - E_k^{(1)})\psi_k^{(1)}\,d\tau. \tag{47.18}$$

In virtue of Eqs. (47.11), (47.15), and (47.16) the expression for the
second-order energy correction takes the final form

$$E_k^{(2)} = \sum_{n \ne k} \frac{|H_1(k,n)|^2}{E_k^{(0)} - E_n^{(0)}} + \int \frac{|H_1(k,E)|^2}{E_k^{(0)} - E}\,dE. \tag{47.19}$$

¹ The equating of coefficients can be justified by taking the scalar product of each
side of the equation and the successive discrete eigenfunctions and eigendifferentials
of $H_0$.

It will be observed that in this approximation each energy level is
pushed upward by those below it and downward by those above it.
The displacements are proportional to the squares of the corresponding
matrix elements of $H_1$ and inversely proportional to the separations of
the unperturbed energy levels involved. Thus in general each level is
most affected by its near neighbors. This tendency is accentuated by
the fact that the matrix elements $H_1(k,n)$, $H_1(k,E)$ are usually small
when the corresponding energy-level differences are large.

In the case of the lowest energy level the quantity $E_0^{(2)}$ is essentially
negative so that the graph of $E_0$ against λ is curved downward at the
point λ = 0. The same argument can be applied for any other value of
λ and we infer that the graph in question is concave downward throughout
its course.
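The discrete part of Eq. (47.19) and the remarks on the direction of the level displacements can be checked on a small matrix model. The sketch below assumes four nondegenerate levels and no continuous spectrum; the matrices, the function `second_order`, and the value of λ are hypothetical choices for illustration.

```python
import numpy as np

E0 = np.array([0.0, 1.0, 2.5, 4.5])         # unperturbed energies
H0 = np.diag(E0)
H1 = np.array([[0.3, 0.1, 0.0, 0.2],
               [0.1, -0.2, 0.4, 0.0],
               [0.0, 0.4, 0.1, 0.3],
               [0.2, 0.0, 0.3, -0.4]])       # real symmetric perturbation

def second_order(k):
    """Discrete sum of Eq. (47.19): sum over n != k of |H1(k,n)|^2 / (E_k - E_n)."""
    return sum(H1[k, n] ** 2 / (E0[k] - E0[n]) for n in range(4) if n != k)

E1 = np.diag(H1)                              # first-order corrections, Eq. (47.9)
E2 = np.array([second_order(k) for k in range(4)])

lam = 1e-2
exact = np.linalg.eigvalsh(H0 + lam * H1)
approx = E0 + lam * E1 + lam ** 2 * E2
residual = np.max(np.abs(exact - approx))     # what remains is of third order

print(E2)
print(residual)
```

In this model the lowest level has a negative second-order correction and the highest a positive one, in line with the statement that each level is pushed away from its neighbors, and the residual is of order λ³.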

The attention of the reader is called to the fact that to compute $E_k^{(1)}$
and $E_k^{(2)}$ we need only evaluate the approximations of next lower order to
the wave function. This is a general characteristic of perturbation
calculations in the semiclassical Bohr theory¹ as well as in quantum
mechanics. It means that for a given amount of labor we can always
compute the energy levels more accurately than the wave functions.
In this connection it is illuminating to recall that eigenvalues of the
energy are the extreme values of integrals over the wave functions.
Thus any small error in the assumed wave function tends to produce an
error of higher order, i.e., a smaller error, in the energy.
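The closing remark, that a small error in the wave function produces an error of higher order in the energy, can be made concrete with a mean-value calculation. The sketch below assumes a four-level matrix Hamiltonian and contaminates its lowest eigenvector with a small admixture ε of the next one; the matrix and the helper `energy_error` are hypothetical.

```python
import numpy as np

H = np.diag([0.0, 1.0, 2.5, 4.5]) + np.array([[0.3, 0.1, 0.0, 0.2],
                                               [0.1, -0.2, 0.4, 0.0],
                                               [0.0, 0.4, 0.1, 0.3],
                                               [0.2, 0.0, 0.3, -0.4]])
vals, vecs = np.linalg.eigh(H)
ground = vecs[:, 0]          # exact lowest eigenvector
excited = vecs[:, 1]         # an orthogonal direction used to spoil it

def energy_error(eps):
    """Mean value of H in the spoiled, renormalized function minus the exact energy."""
    trial = ground + eps * excited
    trial = trial / np.linalg.norm(trial)
    return float(trial @ H @ trial - vals[0])

# Doubling the error in the wave function roughly quadruples the energy error.
ratio = energy_error(0.02) / energy_error(0.01)
print(energy_error(0.01), ratio)
```

The ratio printed is close to 4, showing that the energy error scales as the square of the error in the wave function, and it is positive because the exact lowest energy is a minimum of the mean value.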

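¹ Cf. references in footnote 1, p. 383.

Consider next the second-order correction to the wave function, which
we represent by the series-integral expansion

$$\psi_k^{(2)} = \sum_n \psi_n^{(0)} U^{(2)}(n,k) + \int \psi^{(0)}(E)\,U^{(2)}(E,k)\,dE \tag{47.20}$$

with unknown coefficients. A similar expansion is assumed for the
function $H_1\psi_k^{(1)}$:

$$H_1\psi_k^{(1)} = \sum_n \psi_n^{(0)} h_{nk} + \int \psi^{(0)}(E)\,h_{Ek}\,dE, \tag{47.21}$$

where

$$h_{nk} = \int \psi_n^{(0)*} H_1\Big[\sum_m \psi_m^{(0)} U^{(1)}(m,k) + \int \psi^{(0)}(E')\,U^{(1)}(E',k)\,dE'\Big]d\tau = \sum_{m \ne k} \frac{H_1(n,m)H_1(m,k)}{E_k^{(0)} - E_m^{(0)}} + \int \frac{H_1(n,E')H_1(E',k)}{E_k^{(0)} - E'}\,dE', \tag{47.22}$$

$$h_{Ek} = \int \psi^{(0)*}(E)\,H_1\Big[\sum_m \psi_m^{(0)} U^{(1)}(m,k) + \int \psi^{(0)}(E')\,U^{(1)}(E',k)\,dE'\Big]d\tau. \tag{47.23}$$

Each side of Eq. (47.17) can now be written as a linear combination
of unperturbed eigenfunctions. Equating corresponding coefficients
on the two sides of the resulting equation, we obtain

$$0 = E_k^{(2)} - h_{kk}, \tag{47.24}$$

$$(E_n^{(0)} - E_k^{(0)})\,U^{(2)}(n,k) = E_k^{(1)} U^{(1)}(n,k) - h_{nk}, \qquad n \ne k \tag{47.25}$$

$$(E - E_k^{(0)})\,U^{(2)}(E,k) = E_k^{(1)} U^{(1)}(E,k) - h_{Ek}. \tag{47.26}$$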


The first of these equations is equivalent to Eq. (47.19), while Eq. (47.25)
in conjunction with (47.22) yields

$$U^{(2)}(n,k) = \frac{1}{E_k^{(0)} - E_n^{(0)}}\Big[\sum_{m \ne k} \frac{H_1(n,m)H_1(m,k)}{E_k^{(0)} - E_m^{(0)}} + \int \frac{H_1(n,E)H_1(E,k)}{E_k^{(0)} - E}\,dE - \frac{H_1(n,k)H_1(k,k)}{E_k^{(0)} - E_n^{(0)}}\Big], \qquad n \ne k. \tag{47.27}$$

To avoid undue complications we leave the expression for $U^{(2)}(E,k)$
in the form

$$U^{(2)}(E,k) = \frac{1}{E_k^{(0)} - E}\big[h_{Ek} - E_k^{(1)} U^{(1)}(E,k)\big]. \tag{47.28}$$

The coefficient $U^{(2)}(k,k)$ is not determined by Eq. (47.27) but is
restricted by the normalization condition for $\psi_k$. Differentiating the
equation which expresses this condition twice with respect to λ and
setting λ equal to zero, we obtain, in analogy with Eq. (47.10), the
relation

$$\int \big(\psi_k^{(0)*}\psi_k^{(2)} + \psi_k^{(1)*}\psi_k^{(1)} + \psi_k^{(2)*}\psi_k^{(0)}\big)\,d\tau = 0. \tag{47.29}$$

Hence

$$U^{(2)}(k,k) + U^{(2)}(k,k)^* = -\Big[\sum_n |U^{(1)}(n,k)|^2 + \int |U^{(1)}(E,k)|^2\,dE\Big].$$

We adopt the simplest choice of $U^{(2)}(k,k)$ consistent with the above
relation, viz.,

$$U^{(2)}(k,k) = -\tfrac{1}{2}\Big[\sum_n |U^{(1)}(n,k)|^2 + \int |U^{(1)}(E,k)|^2\,dE\Big]. \tag{47.30}$$

Formulas have now been developed for the determination of all the
coefficients in Eq. (47.20) and our study of the second-order correction
to the wave function is complete. These second-order formulas are
so complicated that they are seldom used and the corrections of the
third and higher order are still more complex.¹ Hence we carry the work
no farther.

47c. An Example: The Diatomic Molecule. We consider here only
a very simple and elementary application of the above perturbation
formulas to the dumbbell model of the diatomic molecule discussed in
Sec. 28.

The radial differential equation (28.19) can be written in the form

$$\Big[-\frac{h^2}{8\pi^2\mu}\frac{d^2}{dr^2} + V(r) + \frac{l(l+1)h^2}{8\pi^2\mu r^2}\Big]\mathcal{R} = E\mathcal{R}. \tag{47.31}$$

The "centrifugal-force" term in this equation can be treated as a
perturbation on the simplified equation

$$H_0\mathcal{R}^{(0)} = \Big[-\frac{h^2}{8\pi^2\mu}\frac{d^2}{dr^2} + V(r)\Big]\mathcal{R}^{(0)} = E^{(0)}\mathcal{R}^{(0)}. \tag{47.32}$$

¹ Cf., however, K. F. Niessen, Phys. Rev. 34, 263 (1929) for energy formulas of
third and fourth order.

If the minimum of V(r) is deep, the lower energy levels of the unperturbed 
problem will be spaced at approximately equal intervals like those of an 
ideal linear oscillator and the corresponding wave functions will resemble 
those of a linear oscillator of suitable frequency with potential minimum 
at r = ro. 

Before applying perturbation methods, however, we must see whether
they are really legitimate in this case. To this end we observe in the
first place that the perturbing Hamiltonian operator

$$H_1 = \frac{l(l+1)h^2}{8\pi^2\mu r^2}$$

introduces no new singular points into the problem. It vanishes rapidly
at infinity, thus insuring that the perturbed wave functions will behave
like the unperturbed at the outer singular point. The fact that $H_1$
becomes infinite more rapidly than V(r) at r = 0 looks suspicious but
leads to no real difficulty. It follows from the work of Sec. 28 that
solutions of the unperturbed problem vanish as r to the first power at
the origin while those of the perturbed problem vanish as $r^{l+1}$. It is
easily shown that eigenfunctions of the interpolation problem vanish
at the origin as a power γ of r which reduces to unity, or l + 1, when λ
is given the values 0 and 1, respectively. Thus the solutions of the
interpolation problem pass continuously into solutions of the
unperturbed or perturbed problems at the ends of the range $0 \le \lambda \le 1$.
Finally the integrals which form the matrix elements of $H_1$ exist despite
the pole of $H_1$ at the origin.

We can now apply the elementary formula (47.9), denoting the mean
value of the reciprocal of the moment of inertia over the vth
unperturbed radial eigenfunction by $1/I_v$. To first-order corrections the
energy of the model is

$$E = E(v,l) = E_v^{(0)} + \frac{l(l+1)h^2}{8\pi^2 I_v}, \qquad l = 0, 1, 2, \ldots; \quad v = 0, 1, 2, \ldots \tag{47.33}$$

This equation is seen to be in harmony with the approximate result
(28.24) previously obtained by the method of Brillouin, Wentzel, and
Kramers if we identify $I_v$ with $\mu r_0^2$ and note that the difference between
$(l + \tfrac{1}{2})^2 h^2/8\pi^2 I_v$ and $l(l+1)h^2/8\pi^2 I_v$ is a constant which
can be absorbed into $E_v^{(0)}$. The perturbation-theory method here
used is the simplest method of deriving the rotational term, but on the
whole the B. W. K. method when used in higher approximation is perhaps
the most satisfactory for dealing with this molecular problem.¹
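The rotational term of Eq. (47.33) is easily tabulated. The sketch below assumes SI units and round, purely illustrative values for the reduced mass and equilibrium separation (roughly the size appropriate to a light diatomic hydride); the function name and these numbers are hypothetical and are not taken from the text.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck's constant in J s

def rotational_energy(l, inertia):
    """First-order rotational term l (l + 1) h^2 / (8 pi^2 I) of Eq. (47.33)."""
    return l * (l + 1) * H_PLANCK ** 2 / (8.0 * math.pi ** 2 * inertia)

mu = 1.6e-27                 # reduced mass in kg, illustrative value only
r0 = 1.3e-10                 # equilibrium separation in m, illustrative value only
inertia = mu * r0 ** 2       # moment of inertia I = mu * r0^2

unit = H_PLANCK ** 2 / (8.0 * math.pi ** 2 * inertia)
levels = [rotational_energy(l, inertia) for l in range(5)]
spacings = [levels[l + 1] - levels[l] for l in range(4)]

# Spacing between successive levels, in units of h^2 / (8 pi^2 I).
print([s / unit for s in spacings])

# The B. W. K. form (l + 1/2)^2 differs from l (l + 1) by the constant 1/4.
offsets = [(l + 0.5) ** 2 - l * (l + 1) for l in range(5)]
print(offsets)
```

The spacings come out as 2, 4, 6, 8 in units of h²/8π²I, and the offsets are all equal to 1/4, which is the constant that can be absorbed into the vibrational energy.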


48. THE PERTURBATION THEORY FOR DEGENERATE PROBLEMS 



48a. First-order Energy Perturbations. The straightforward
perturbation theory for initially degenerate systems has a tendency to
become complicated. The various schemes for reducing the complexities
are best appreciated, however, after an introduction to the problem,
such as that which follows, which makes no use of tricks. Discussion of
the perturbation theory for degenerate problems is facilitated
by reference to a graph of the interpolation
energies E(λ) plotted against λ such as that
shown in Fig. 19. The degree of degeneracy or
statistical weight g of an energy level being
defined as the number of linearly independent
eigenfunctions for the given energy value, it
is evident a priori that this quantity must
be independent of λ except at points where
two or more energy curves meet or cross. At such points the statistical
weight is equal to the sum of its values for the different curves which
meet. If the symmetry of the perturbed problem is the same as that
of the unperturbed, the degeneracy of each level of the interpolation
problem will usually be the same as that of a corresponding unperturbed
level. In this case the curves for the different interpolation levels will
have no tendency to meet at the λ = 0 axis. Very frequently, however,
the perturbed and interpolation problems are less symmetrical than the
unperturbed, and any given unperturbed energy value $E^{(0)}$ can be
the starting point of several divergent E,λ curves. In such cases an
arbitrary unperturbed wave function u(q) is not necessarily the limit as λ
approaches zero of some interpolation function $\psi(\lambda,q)$. Thus if $\psi_a$ and
$\psi_b$ are two interpolation functions whose energy curves meet at λ = 0,
it is evident that any linear combination of $\psi_a(0,q)$ and $\psi_b(0,q)$ makes a
possible unperturbed function although it will not be the limit of any
$\psi(\lambda,q)$. Hence the first term in the expansion of ψ in powers of λ cannot
be assumed to be known even if the unperturbed problem has been fully
treated. The determination of this first term, or zero-order
approximation, is a process which must be carried along with the determination of
the energy and the higher order terms in ψ.

Fig. 19. The splitting of energy levels by the application of a perturbation.

¹ Cf. E. Fues, Ann. d. Phys. 80, 367 (1926) for the classical application of the
perturbation-theory method to this problem. The most complete treatment by
means of the B. W. K. method has been given by J. L. Dunham (cf. footnote 1, p.
M7).





Let us now suppose that we have given a complete orthonormal system
of eigenfunctions of the unperturbed-problem Hamiltonian $H_0$. Each
discrete function will be denoted by a symbol $u_{ki}$, where the first
subscript, $k$, indicates the energy level, while $i$ indicates the
individual member of the group of eigenfunctions which belong to this level.
According to Sec. 32A; the discrete unperturbed levels have a finite 
degeneracy, so that the index $i$ ranges from 1 to an upper limit $g_k$
which is the statistical weight of the energy level $E_k^{(0)}$. (To avoid
an undue multiplicity of subscripts we shall at times omit the subscript
on the symbol $g$ when no ambiguity will result.) The continuous-spectrum
wave functions must be provided with similar double subscripts $E$ and $i$.
Finally the interpolation wave functions $\psi_{ki}$ will also be specified
by two subscripts $k$, $i$ of which the first indicates the parent
unperturbed energy level.

Equation (47·7) is applicable in the present case provided that
additional subscripts are introduced. Thus

$$(H_0 - E_k^{(0)})\,\psi_{ki}^{(1)} = (E_{ki}^{(1)} - H_1)\,\psi_{ki}^{(0)}. \qquad (48\cdot1)$$

By the argument of the preceding section the left-hand member is
orthogonal to all eigenfunctions of the unperturbed equation belonging to
the energy level $E_k^{(0)}$. Hence it is orthogonal to
$u_{k1}, \cdots, u_{kg}$, and the right-hand member must have the same
properties. Forming the scalar product of the right-hand member with
$\psi_{ki}^{(0)}$ we obtain

$$E_{ki}^{(1)} = (H_1\psi_{ki}^{(0)},\ \psi_{ki}^{(0)}) \qquad (48\cdot2)$$

in analogy with (47·9). In this case, however, the equation is not so
directly useful since the functions $\psi_{ki}^{(0)}$ are as yet unknown.

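The functions $\psi_{ki}^{(0)}$ are of course expressible in the form

$$\psi_{ki}^{(0)} = \sum_{j=1}^{g_k} u_{kj}\,U^{(0)}(kj;ki). \qquad (48\cdot3)$$

If the interpolation functions form an orthonormal set, as we shall
suppose, the zero-order functions must do the same. Under these
circumstances the coefficients $U^{(0)}(kj;ki)$ of (48·3) are subject to
the equations

$$\sum_{j=1}^{g_k} U^{(0)}(kj;ki)^{*}\,U^{(0)}(kj;ki') = \delta_{ii'}. \qquad (48\cdot4)$$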


In fact the set of coefficients for each unperturbed energy level form
a finite unitary matrix and by adding zero elements connecting different
unperturbed energies we can form an infinite unitary step matrix $U^{(0)}$
like the matrix V of p. 363. Substituting the sum (48·3) into the
right-hand member of Eq. (48·1) and setting the scalar products of the
resulting expression and the successive functions $u_{kl}$
$(l = 1, 2, \cdots, g_k)$ in turn equal to zero we obtain the set of $g_k$
equations

$$\sum_{j=1}^{g_k} \left[H_1(kl;kj) - E_{ki}^{(1)}\delta_{lj}\right] U^{(0)}(kj;ki) = 0, \qquad l = 1, 2, 3, \cdots, g_k \qquad (48\cdot5)$$

in which $H_1(kl;kj)$ is the matrix element

$$H_1(kl;kj) = \int u_{kl}^{*}\,H_1 u_{kj}\,d\tau = (H_1 u_{kj},\ u_{kl}). \qquad (48\cdot6)$$

The above set of $g_k$ simultaneous linear homogeneous equations in
the $g_k$ unknowns $U^{(0)}(k1;ki)$, $U^{(0)}(k2;ki)$, $\cdots$,
$U^{(0)}(kg_k;ki)$ is formally the same as the set of equations (44·24)
or (44·30). Hence the theory of the eigenvalue-eigenfunction problem for
finite matrices, as developed in Sec. 44c, is applicable. Nontrivial
solutions can exist if, and only if, $E_{ki}^{(1)}$ is a root of the
secular equation obtained by setting the determinant of the coefficients
equal to zero. Let us denote the $g_k \times g_k$ element Hermitian matrix
$\|H_1(kl;kj)\|$, in which $k$ is fixed, by ${}^{(k)}H_1$ (cf. Fig. 20).
Then we can write the secular equation in the form

$$\det\left({}^{(k)}H_1 - E_{ki}^{(1)} I\right) = 0. \qquad (48\cdot7)$$

Fig. 20. — Diagram of the steps of $H_1$ initially diagonal in $k$ and to
be completely diagonalized by a canonical transformation.

This equation has $g_k$ real roots. If they are all distinct, there are
$g_k$ different initial slopes for the $E$ versus $\lambda$ curves (Fig. 19)
which originate at the $k$th unperturbed level. Hence there must be $g_k$
nondegenerate interpolation levels which unite to form a single degenerate
level when $\lambda$ is zero. At the other extreme is the case where the
off-diagonal elements of the matrix ${}^{(k)}H_1$ are zero while the
diagonal elements are equal. In this case, (48·7) has a single $g_k$-fold
root and the interpolation problem in this approximation has the same
degeneracy as the unperturbed problem.

The set of $g_k$ unknowns $U^{(0)}(kj;ki)$ for each value of $i$
constitutes a vector like the vectors of Sec. 44c. It can always be
normalized and will be automatically orthogonal to any similar vector
derived from a different root of the secular equation. Whether the roots
are distinct or multiple it is always possible to choose a set of
eigenvectors, including $p$ for each $p$-fold root, which are mutually
orthogonal and unite to form a $g_k \times g_k$ element unitary matrix
$\|U^{(0)}(kj;ki)\|$.
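
The first-order step can be tried numerically on the smallest nontrivial case. The following is a minimal sketch, assuming a two-fold degenerate level whose $2 \times 2$ block of $H_1$ is supplied directly; the function name `first_order_split` and the sample numbers are hypothetical and serve only to illustrate the secular equation (48·7) and the orthogonality of the resulting zero-order vectors.

```python
import math

def first_order_split(block):
    """Return [(E1, vector), ...] for a 2x2 Hermitian block of H1.

    Each E1 is a root of det(block - E1*I) = 0 and each vector is a
    normalized column of the zero-order matrix restricted to this level.
    """
    a = complex(block[0][0]).real   # diagonal elements of a Hermitian matrix are real
    d = complex(block[1][1]).real
    b = complex(block[0][1])        # off-diagonal element; block[1][0] is its conjugate
    if abs(b) < 1e-14:
        # Already diagonal: the unperturbed functions serve as zero-order functions.
        return sorted([(a, [1 + 0j, 0j]), (d, [0j, 1 + 0j])], key=lambda p: p[0])
    mean = 0.5 * (a + d)
    half_gap = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    pairs = []
    for e1 in (mean - half_gap, mean + half_gap):
        vec = [b, e1 - a]           # satisfies (a - e1)*x + b*y = 0
        norm = math.sqrt(abs(vec[0]) ** 2 + abs(vec[1]) ** 2)
        pairs.append((e1, [x / norm for x in vec]))
    return pairs

# Hypothetical two-fold level whose states are coupled by H1 with strength 0.5.
pairs = first_order_split([[0.0, 0.5], [0.5, 0.0]])
```

With these sample values the two roots are distinct, so the level splits into two nondegenerate interpolation levels with opposite initial slopes.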

48b. Second-order Energy Perturbations. — Our first-order energy
and zero-order wave function calculation has led us to a simple matrix
eigenvalue-eigenvector problem. Before proceeding farther it will be
useful to restate the entire perturbation problem in matrix form. For
this purpose we identify the operator $\alpha$ of (44·23) with the
Hamiltonian operator $H$. Let us at the same time generalize our previous
treatment by assuming that $H$ is not necessarily linear in $\lambda$ but
is representable by a power series. For our basic set of $\varphi$
functions we take the unperturbed wave functions $u_{ki}$. We write the
corresponding initial matrix as
$H = H_0 + \lambda H_1 + \lambda^2 H_2 + \cdots$ and take $H_0$ to be an
ordered diagonal matrix. For the diagonal matrix A of Eq. (44·33) to be
obtained by the transformation of $H$ we introduce the symbol $E$ since it
is built up from the interpolation energy values. Thus (44·33) becomes

$$(H_0 + \lambda H_1 + \lambda^2 H_2 + \cdots)\,U = U E. \qquad (48\cdot8)$$

Following the procedure of Sec. 47 we seek a solution in power series:

$$U = U^{(0)} + \lambda U^{(1)} + \lambda^2 U^{(2)} + \cdots, \qquad E = E^{(0)} + \lambda E^{(1)} + \lambda^2 E^{(2)} + \cdots. \qquad (48\cdot9)$$

Here $E^{(0)}$, $E^{(1)}$, $E^{(2)}$, $\cdots$ are diagonal and $E^{(0)}$
must contain the same diagonal elements as $H_0$. We assume that $E^{(0)}$
is identical with $H_0$. This means that the numbering of the $\psi$'s is
made to harmonize with that of the $u_{ki}$'s. Inserting the expansion
into (48·8) and equating the coefficients of like powers of $\lambda$ to
zero, we obtain the sequence of equations

$$E^{(0)}U^{(0)} - U^{(0)}E^{(0)} = 0, \qquad (48\cdot10)$$

$$E^{(0)}U^{(1)} - U^{(1)}E^{(0)} = U^{(0)}E^{(1)} - H_1U^{(0)}, \qquad (48\cdot11)$$

$$E^{(0)}U^{(2)} - U^{(2)}E^{(0)} = U^{(0)}E^{(2)} + U^{(1)}E^{(1)} - H_1U^{(1)} - H_2U^{(0)}. \qquad (48\cdot12)$$

If a unitary matrix $U$ can be found which satisfies these equations, it
will transform the $u_{ki}$ functions into the $\psi_{ki}$'s; $U^{(0)}$
will transform the $u_{ki}$'s into the $\psi_{ki}^{(0)}$'s, etc.




In view of the unitary character of $U$ it is necessary that the equations

$$U^{(0)}U^{(0)\dagger} = I, \qquad (48\cdot13)$$

$$U^{(1)}U^{(0)\dagger} + U^{(0)}U^{(1)\dagger} = 0, \qquad (48\cdot14)$$

$$U^{(2)}U^{(0)\dagger} + U^{(0)}U^{(2)\dagger} + U^{(1)}U^{(1)\dagger} = 0, \qquad (48\cdot15)$$

shall also be satisfied.

Equations (48·10) and (48·13) reaffirm the conclusion reached on
p. 391 that $U^{(0)}$ is a unitary step matrix with no nonvanishing
elements connecting different unperturbed energy levels. Taking the
$(kj;ki)$ element of each side of (48·11) for values of $j$ ranging from 1
to $g_k$, we obtain the simultaneous equations (48·5) and thereby rederive
the conclusion that $U^{(0)}$ must diagonalize the step matrix obtained
from $H_1$ by striking out all elements connecting different unperturbed
energies.

The determination of $E^{(2)}$ and $U^{(1)}$ with the aid of (48·12) and
the remaining elements of (48·11) leads to algebraic difficulties unless
the minors ${}^{(k)}H_1$ happen to be diagonal. Hence it is convenient to
apply to $H$ a canonical transformation with a matrix $V$ chosen like
$U^{(0)}$ to diagonalize each ${}^{(k)}H_1$. We distinguish between $V$ and
$U^{(0)}$ because the latter is not yet fully determined unless all the
roots of the secular equation (48·7) happen to be distinct. $V$ is simply
an arbitrary solution of (48·10) whose steps satisfy the additional
equations

$${}^{(k)}V^{-1}\;{}^{(k)}H_1\;{}^{(k)}V = \text{a diagonal matrix}. \qquad (48\cdot16)$$

As a result of the transformation $H_1$ is replaced by a new matrix

$$\bar H_1 = V^{-1}H_1V$$

with every ${}^{(k)}\bar H_1$ diagonal. In place of Eqs. (48·8) to (48·15)
we have a formally identical set with $U$ replaced by $\bar U = V^{-1}U$.
We specify the new set of basic wave functions

$$w_{kl} = \sum_{j=1}^{g_k} u_{kj}\,V(kj;kl)$$

and the rows and columns of $\bar H_1$ by indices $k$ and $l$ instead of
the old indices $k$ and $j$.

Let $\bar U^{(n)}$ denote the coefficient of $\lambda^n$ in the expansion
of $\bar U$ in powers of $\lambda$. In view of the discussion on p. 363 and
the fact that both $U^{(0)}$ and $V$ diagonalize every ${}^{(k)}H_1$,
$\bar U^{(0)}$ is readily seen to be a step matrix with no nonvanishing
elements connecting different values of $E_k^{(0)} + \lambda E_{ki}^{(1)}$.
If the unperturbed levels are actually split by the first-order energy
calculation, the steps of $\bar U^{(0)}$ will be smaller, in general, than
those of $U^{(0)}$.

Let us now apply the transformed set of Eqs. (48·12) to the determination
of the second-order energy perturbation $E_{ki}^{(2)}$. Every $(kl;ki)$
element of the left-hand member vanishes automatically. Setting the
corresponding elements of the right-hand member equal to zero, and
remembering that every ${}^{(k)}\bar H_1$ is diagonal, we get

$$\bar U^{(0)}(kl;ki)\,E_{ki}^{(2)} - \bar U^{(1)}(kl;ki)\left[E_{kl}^{(1)} - E_{ki}^{(1)}\right] - \sum_{k' \ne k}\sum_{l'} \bar H_1(kl;k'l')\,\bar U^{(1)}(k'l';ki) - \sum_{l''} \bar H_2(kl;kl'')\,\bar U^{(0)}(kl'';ki) = 0. \qquad (48\cdot17)$$

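Let us now choose $l$ and $i$ so that $E_{kl}^{(1)}$ and $E_{ki}^{(1)}$ are
equal, thus eliminating the second term of the above equation. For
$k' \ne k$, Eq. (48·11) yields

$$\bar U^{(1)}(k'l';ki) = -\frac{\sum_{l''} \bar H_1(k'l';kl'')\,\bar U^{(0)}(kl'';ki)}{E_{k'}^{(0)} - E_k^{(0)}}.$$

Substitution of this value of $\bar U^{(1)}$ into (48·17) gives

$$\bar U^{(0)}(kl;ki)\,E_{ki}^{(2)} - \sum_{l''} F(kl;kl'')\,\bar U^{(0)}(kl'';ki) = 0, \qquad (48\cdot18)$$

with $F(kl;kl'')$ defined by

$$F(kl;kl'') = \sum_{k' \ne k}\sum_{l'} \frac{\bar H_1(kl;k'l')\,\bar H_1(k'l';kl'')}{E_k^{(0)} - E_{k'}^{(0)}} + \bar H_2(kl;kl''). \qquad (48\cdot19)$$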

On account of the step form of $\bar U^{(0)}$ the $l''$ sum in (48·18) is
to be extended only over one of the small squares corresponding to a
single pair of values of $E_k^{(0)}$ and $E_{ki}^{(1)}$. For any given $i$
the index $l$ has the same range as $l''$. Hence the number of Eqs. (48·18)
for any given $i$ is equal to the number of unknown elements of
$\bar U^{(0)}$ which appear and we can use these equations like (48·5) to
determine the energy correction $E_{ki}^{(2)}$. In fact $E_{ki}^{(2)}$ must
be a root of the secular equation

$$\det\left({}^{(ki)}F - E_{ki}^{(2)} I\right) = 0, \qquad (48\cdot20)$$

where ${}^{(ki)}F$ is a diagonal square of $F$ corresponding to a single
value of $E_k^{(0)} + \lambda E_{ki}^{(1)}$.

Usually the ultimate degeneracy of the perturbed problem is that
which remains after the application of the first-order perturbation.¹
When that is the case the eigenvalues of each of the small ${}^{(ki)}F$
matrices are all equal. In other words each of these small matrices is
transformed by $\bar U^{(0)}$ into a multiple of a unit matrix. It follows
that those $F$ matrices are initially multiples of unity and that every
${}^{(k)}F$ is initially diagonal.

¹ This will surely be the case if the initial degeneracy is entirely removed by the
first-order perturbation. Otherwise, we may know from a study of the integrals of
the problem what the ultimate degeneracy is. Then, if the first-order perturbation
splits each unperturbed level into the expected final number of sublevels, we can be
sure that no further splitting will come from the higher order perturbations.



Under these circumstances (48·18) must reduce to

$$E_{kl}^{(2)} = F(kl;kl) = \bar H_2(kl;kl) + \sum_{k' \ne k}\sum_{l'} \frac{|\bar H_1(kl;k'l')|^2}{E_k^{(0)} - E_{k'}^{(0)}}, \qquad (48\cdot21)$$

and the off-diagonal elements $F(kl;kl'')$ need not be computed. It will
be observed that (48·21) is essentially the same as (47·19) except that
it is based on $\bar H_1$ instead of $H_1$.
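
The construction of $F$ and its secular equation can be checked on a small model. The sketch below is illustrative only: it assumes real symmetric matrices, a hypothetical three-state system (a two-fold level at energy 0 coupled to a single state at energy 1), and a first-order block that is already diagonal, so that no preliminary transformation is needed. The helper names `second_order_block` and `roots_2x2` are not from the text.

```python
import math

def second_order_block(level, e0, h1, h2):
    """Build F(kl; kl'') for the states in `level`, following Eq. (48.19).

    level  : indices of all degenerate states of the level k
    e0     : unperturbed energies of all basis states
    h1, h2 : real symmetric matrices (lists of lists); h1 is assumed to be
             already diagonal inside the level
    """
    ek = e0[level[0]]
    others = [p for p in range(len(e0)) if p not in level]
    return [[h2[l][l2] + sum(h1[l][p] * h1[p][l2] / (ek - e0[p]) for p in others)
             for l2 in level] for l in level]

def roots_2x2(f):
    """Roots of det(f - E*I) = 0 for a real symmetric 2x2 matrix."""
    mean = 0.5 * (f[0][0] + f[1][1])
    half_gap = math.sqrt((0.5 * (f[0][0] - f[1][1])) ** 2 + f[0][1] ** 2)
    return [mean - half_gap, mean + half_gap]

# Hypothetical model: a two-fold level at 0 coupled to a single state at 1.
e0 = [0.0, 0.0, 1.0]
h1 = [[0.0, 0.0, 0.3], [0.0, 0.0, 0.4], [0.3, 0.4, 0.0]]
h2 = [[0.0] * 3 for _ in range(3)]
f = second_order_block([0, 1], e0, h1, h2)
e2 = roots_2x2(f)

# For this model the lowest exact eigenvalue is known in closed form:
# E = (1 - sqrt(1 + 4*lam**2*(0.3**2 + 0.4**2))) / 2.
lam = 0.01
exact_low = 0.5 * (1.0 - math.sqrt(1.0 + 4.0 * lam ** 2 * 0.25))
```

In this model the first-order corrections of the two states coincide, so the whole level forms one small square of $F$ and the degeneracy is removed only in second order; `lam ** 2 * e2[0]` can be compared with `exact_low`.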

*48c. Van Vleck's Method for Second-order Perturbations. — Unfortunately,
it is often difficult, or impossible, to apply the transformation
$V$ to the whole of $H_1$, although there may be no difficulty in
diagonalizing those steps of $H_1$ which belong to the lower unperturbed
energy levels in which there is the greatest physical interest. This
difficulty is the more serious because the upper, continuous portion of
the energy spectrum can be brought within the scope of the present
discrete-element matrix theory only by the introduction of an imaginary
box surrounding the atomic system under consideration, or by some similar
awkward device. It is therefore of interest to examine an ingenious
procedure suggested by Van Vleck¹ which permits the evaluation of
second-order energy corrections for the lower energy levels without
making the complete transformation $V$.

The essential feature of the method is the introduction of a different
canonical transformation, which we shall refer to as a Van Vleck
transformation, with a matrix $T(\lambda)$ which eliminates the terms of
the first order in $\lambda$ from the matrix components of $H$ connecting
any particular unperturbed energy $E_k^{(0)}$ with the other unperturbed
levels. The Van Vleck transformation is relatively easy to apply and
makes it unnecessary to compute the difficult elements of $\bar H_1$. The
transformation depends on the particular unperturbed energy level to be
dealt with, but for simplicity we suppress the index $k$ in the notation
for the matrix $T(\lambda)$.

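Let us give $T(\lambda)$ the form²

$$T(\lambda) = e^{i\lambda S} = I + i\lambda S - \tfrac{1}{2}\lambda^2 S^2 - \tfrac{i}{3!}\lambda^3 S^3 + \cdots, \qquad (48\cdot22)$$

where $S$ is Hermitian. The adjoint of $T$ is

$$T(\lambda)^{\dagger} = e^{-i\lambda S} = I - i\lambda S - \tfrac{1}{2}\lambda^2 S^2 + \tfrac{i}{3!}\lambda^3 S^3 + \cdots. \qquad (48\cdot23)$$

As all powers of $S$ are Hermitian, $TT^{\dagger} = T^{\dagger}T = I$, and $T$ is automatically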

¹ The method was first used in a paper by J. H. Van Vleck on "Sigma-type
Doubling and Electron Spin," Phys. Rev. 33, 467 (1929), see especially pp. 484 and
485, but is more fully described by Jordahl, a pupil of Van Vleck, in a paper on
paramagnetic susceptibility, Phys. Rev. 45, 87 (1934).

² The particular form (48·22) was suggested to the author by Dr. Bela Lengyel.



unitary. Let $G(\lambda)$ denote the transformed Hamiltonian defined
explicitly by

$$T^{-1}HT = G(\lambda) = G_0 + \lambda G_1 + \lambda^2 G_2 + \cdots. \qquad (48\cdot24)$$

Introducing (48·22), the equation for the determination of $G$ takes the
form

$$(I + i\lambda S - \tfrac{1}{2}\lambda^2 S^2 - \cdots)(G_0 + \lambda G_1 + \lambda^2 G_2 + \cdots) = (H_0 + \lambda H_1 + \lambda^2 H_2 + \cdots)(I + i\lambda S - \tfrac{1}{2}\lambda^2 S^2 - \cdots). \qquad (48\cdot25)$$

Equating coefficients of like powers of $\lambda$,

$$G_0 = H_0, \qquad (48\cdot26)$$

$$G_1 = H_1 + i(H_0S - SG_0), \qquad (48\cdot27)$$

$$G_2 = H_2 + i(H_1S - SG_1) - \tfrac{1}{2}(H_0S^2 - S^2G_0). \qquad (48\cdot28)$$


Let us next choose $S$ so that $S(k''j'';k'j') = 0$ if $k' \ne k$ and
$k'' \ne k$. Evidently this choice does not affect the unitary character
of $T$. It follows from (48·27) that under these circumstances
$G_1(kj;kj')$ and $G_1(k''j'';k'j')$ are equal to $H_1(kj;kj')$ and
$H_1(k''j'';k'j')$, respectively, provided neither $k''$ nor $k'$ is equal
to $k$. Thus the transformation $T$ to terms of the second order in
$\lambda$ affects only that portion of $H$ which connects $E_k^{(0)}$ with
other unperturbed levels. We now introduce the requirement that elements
of $G_1$ of the form $G_1(kj;k'j')$ with $k \ne k'$ shall all vanish.
Equations (48·26) and (48·27) give

$$H_1(kj;k'j') + i\left[H_0(kj;kj) - H_0(k'j';k'j')\right]S(kj;k'j') = 0,$$

or

$$S(kj;k'j') = \frac{i\,H_1(kj;k'j')}{E_k^{(0)} - E_{k'}^{(0)}}. \qquad (48\cdot29)$$


It thus appears that our problem is solvable. Having determined
$S$ and $G_1$ with the aid of (48·27) we can employ (48·28) and the
equations of higher order to compute $G_2$, etc. In particular we find
after reduction that

$$G_2(kj;kj') = H_2(kj;kj') + \sum_{k' \ne k}\sum_{j''} \frac{H_1(kj;k'j'')\,H_1(k'j'';kj')}{E_k^{(0)} - E_{k'}^{(0)}}. \qquad (48\cdot30)$$


In order to compute the second-order energy correction we now apply
the method of Sec. 48b, which goes through more simply than before.
First we apply a canonical transformation with matrix $V$ which
diagonalizes the minor ${}^{(k)}H_1$, i.e., ${}^{(k)}G_1$. Let $\bar G$
denote the transformed matrix $V^{-1}GV$. Equations (48·10) to (48·12) are
now applicable with $\bar G_1$ and $\bar G_2$ substituted for $H_1$ and
$H_2$, respectively. As before, we use $\bar U$ to denote the
transformation matrix required to diagonalize $\bar G$. The $(kl;k'i)$
elements of (48·11) show that $\bar U^{(1)}(kl;k'i)$ is now zero when
$k' \ne k$. In place of (48·17) we have

$$\bar U^{(0)}(kl;ki)\,E_{ki}^{(2)} - \bar U^{(1)}(kl;ki)\left[E_{kl}^{(1)} - E_{ki}^{(1)}\right] - \sum_{l''} \bar G_2(kl;kl'')\,\bar U^{(0)}(kl'';ki) = 0. \qquad (48\cdot31)$$

Choosing $i$ and $l$ to make $E_{kl}^{(1)}$ and $E_{ki}^{(1)}$ equal, we
reduce the equations to the form of Eqs. (48·18) with $\bar G_2(kl;kl'')$
substituted for $F(kl;kl'')$. The solution is to be carried through as
before with the advantage that in this case the required elements of
$\bar G_2$ are obtained much more easily than the corresponding elements
of $F$. Whereas in the former case we had to apply the transformation $V$
to elements of $H_1$ connecting $E_k^{(0)}$ with all other unperturbed
levels, in this case we have no corresponding elements of $G_1$ to
transform and, in applying $V$, we can deal with a single small square of
$G$ which belongs to the energy level $E_k^{(0)}$.

In case the higher order perturbations remove no degeneracy left
by the first-order perturbation, i.e., in the case to which (48·21) is
applicable, Eqs. (48·31) reduce to $E_{ki}^{(2)} = \bar G_2(ki;ki)$.
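
A small numerical check of the Van Vleck transformation may help fix ideas. The sketch below is an illustration under stated assumptions, not the author's procedure in full: it uses a hypothetical three-state model with $H_2 = 0$, sets to zero every element of $S$ that (48·29) leaves undetermined, and truncates $T$ after the $\lambda^2$ term of (48·22), which is sufficient for a comparison through second order. All function names are invented for the example.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian adjoint of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def van_vleck_s(level, e0, h1):
    """Hermitian S from Eq. (48.29); elements not fixed by it are set to zero."""
    n = len(e0)
    s = [[0j] * n for _ in range(n)]
    for j in level:
        for p in range(n):
            if p not in level:
                s[j][p] = 1j * h1[j][p] / (e0[j] - e0[p])
                s[p][j] = 1j * h1[p][j] / (e0[p] - e0[j])
    return s

def transformed(lam, e0, h1, s):
    """T^dagger H T with T truncated after the lambda**2 term of Eq. (48.22)."""
    n = len(e0)
    s2 = matmul(s, s)
    t = [[(1.0 if i == j else 0.0) + 1j * lam * s[i][j] - 0.5 * lam ** 2 * s2[i][j]
          for j in range(n)] for i in range(n)]
    h = [[(e0[i] if i == j else 0.0) + lam * h1[i][j] for j in range(n)]
         for i in range(n)]
    return matmul(dagger(t), matmul(h, t))

# Hypothetical model: two-fold level (states 0, 1) at energy 0, one state at 1.
e0 = [0.0, 0.0, 1.0]
h1 = [[0.3, 0.1, 0.5], [0.1, -0.2, 0.4], [0.5, 0.4, 0.7]]
level = [0, 1]
lam = 1e-3
g = transformed(lam, e0, h1, van_vleck_s(level, e0, h1))

# Second-order block predicted by Eq. (48.30) with H2 = 0.
g2 = [[sum(h1[j][p] * h1[p][jp] / (e0[j] - e0[p]) for p in [2]) for jp in level]
      for j in level]
```

After the transformation the elements `g[0][2]` and `g[1][2]` are of order `lam ** 2` rather than `lam`, and the small square belonging to the chosen level agrees with `lam * h1 + lam ** 2 * g2` up to terms of order `lam ** 3`.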

48d. Simplification of Perturbation Calculations by Means of Integrals
of the Perturbed Hamiltonian. — Fortunately the complexities of the
general perturbation method can usually be reduced by proper use of the
"integrals of the motion," i.e., the dynamical variables which commute
with $H$ (cf. Sec. 38c, p. 291).

Let us assume that for all values of $\lambda$, $\alpha$ and $H$ form a
normal commuting pair of dynamical variables (cf. Sec. 37c, p. 286) so
that they have a complete system of simultaneous eigenfunctions. With
these functions their matrices can be made simultaneously diagonal. We
shall further suppose that a complete system of simultaneous
eigenfunctions of $\alpha$ and the unperturbed Hamiltonian $H_0$ is given.
Under these circumstances it is convenient to seek solutions of the
perturbed problem which are simultaneous eigenfunctions of $\alpha$ and
$H$. We can then deal with one eigenvalue of $\alpha$ at a time as well as
with one eigenvalue of $H_0$. Let $\psi_a$ denote such a simultaneous
eigenfunction of $\alpha$ and $H$, the $\alpha$ eigenvalue being $a$. Then
$\psi_a$ is expressible as a linear combination of unperturbed functions
for the same eigenvalue of $\alpha$. All other unperturbed functions can
be ignored in working out $\psi_a$ and its $H$ eigenvalue. Thus the
original perturbation problem can be said to factor into a number of
partial problems, one for each eigenvalue of $\alpha$.

Let $u_{knm}$ denote a simultaneous eigenfunction of $H_0$ and $\alpha$
with the respective eigenvalues $E_k^{(0)}$ and $a_n$. $m$ is a degeneracy
index indicating the eigenvalues of additional variables which together
with $H_0$ and $\alpha$ make up a complete commuting set. Let $H_1$,
$H_2$, $\cdots$ denote the matrices of the corresponding operators derived
from the basic functions $u_{knm}$. The matrices of $H_0$ and $\alpha$ in
this scheme are diagonal. We denote them by $E^{(0)}$ and $A$,
respectively. By hypothesis $\alpha$ commutes with $H$ for all values of
$\lambda$. Hence it commutes with $H_1$, $H_2$, $\cdots$. Thus the diagonal
matrix $A$ commutes with $H_1$, $H_2$, $\cdots$, and consequently the
latter matrices have no elements connecting different eigenvalues of
$\alpha$. In other words every term in the expansion of $H$ in powers of
$\lambda$ is diagonal in the index $n$.

Let the matrix $U$ convert $H$ to diagonal form at the same time keeping
the matrix of $\alpha$ diagonal. By proper numbering of rows and columns
we can insure that the matrix of $\alpha$ is actually invariant of the
canonical transformation with the matrix $U$. But this means that $U$
commutes with $A$ and implies that $U$, and each term in its expansion, is
diagonal in $n$. In Eqs. (48·8), (48·9), etc., we can now replace each
matrix by a submatrix obtained from the original one by picking out such
elements as $H(k,n',m;\ k'',n',m'')$ and holding $n'$ fast while the other
indices range through all possible values. The whole perturbation
calculation can then be carried through separately for each value of $n'$.
The advantage is, of course, that by this procedure we reduce the order of
the secular determinants to be solved and in fact greatly reduce the
number of matrix elements to be dealt with in solving the complete
problem.

Evidently if there are several independent dynamical variables which
commute with $H$ we can use simultaneous eigenfunctions of all of them
and so deal with a single set of eigenvalues at a time. It is desirable, if
possible, to pick out independent integrals of the motion $\alpha_1$,
$\alpha_2$, $\alpha_3$, $\cdots$ which, together with $H$, form a complete
normal commuting set. Their simultaneous eigenfunctions will then be
uniquely defined except for phase factors. If both the unperturbed and
perturbed eigenfunctions are chosen to be simultaneous eigenfunctions of
all these integrals, the secular determinants will be reduced to the
lowest order possible. It is not necessarily possible to reduce the order
of the secular determinants to unity, however, for the reason that the
dynamical variables $\alpha_1$, $\alpha_2$, $\alpha_3$, $\cdots$, which
unite with $H$ to form a complete independent commuting set, may not, and
generally do not, combine with $H_0$ to form a similar set. This is due to
the fact that ordinarily $H_0$ has a higher symmetry than $H$. If such
were not the case, the unperturbed problem would be no simpler than the
perturbed one. Consequently not every dynamical variable which commutes
with $H_0$ need commute with $H$. Moreover, dynamical variables which are
independent of $H_0$ may not be independent of $H$, although they commute
with both.

Let us assume that one or more additional independent dynamical
variables $\beta_1$, $\beta_2$, $\cdots$ must be added to the $\alpha$'s in
order to secure a complete independent commuting set $H_0$, $\alpha_1$,
$\cdots$, $\beta_1$, $\cdots$ for the unperturbed problem. Then either
(a) all the $\beta$'s commute with $H$ but are functions of $H$ and the
$\alpha$'s, or (b) one or more of the $\beta$'s fail to commute with $H$.
In the former case we can evidently deal with one set of eigenvalues of
the $\alpha$'s and the $\beta$'s at a time in applying the perturbation
method, thus resolving the complete problem into a multiplicity of
partial problems, each having nondegenerate sets of unperturbed
eigenvalues. In this case the order of the secular determinants is
reduced to unity and each partial problem can be dealt with by the
methods of Sec. 47. On the other hand in case (b) each partial
perturbation problem must necessarily involve a multiplicity of
eigenvalues of those $\beta$'s which do not commute with $H$. The
zero-order wave functions are then unknown linear combinations of a
multiplicity of initial unperturbed functions and the secular
determinants cannot all be reduced to the first order. Both cases are
illustrated in the examples of Secs. 49 and 50.
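
The factoring of the problem by an integral of the motion amounts to sorting the basis functions by their $\alpha$ eigenvalue and treating each group separately. A minimal sketch follows, assuming the Hamiltonian matrix and a list of $\alpha$ labels for the basis functions are given; the function name `factor_by_label`, the labels, and the numbers are hypothetical.

```python
from collections import defaultdict

def factor_by_label(h, labels, tol=1e-12):
    """Split a matrix into submatrices, one for each value of the label.

    Raises ValueError if h has an element connecting different labels,
    i.e. if the labelled variable is not an integral of the motion for h.
    """
    groups = defaultdict(list)
    for idx, lab in enumerate(labels):
        groups[lab].append(idx)
    for i, row in enumerate(h):
        for j, x in enumerate(row):
            if labels[i] != labels[j] and abs(x) > tol:
                raise ValueError("matrix connects labels %r and %r" % (labels[i], labels[j]))
    return {lab: [[h[i][j] for j in idxs] for i in idxs]
            for lab, idxs in groups.items()}

# Hypothetical 4x4 Hamiltonian matrix that is diagonal in the label.
h = [[1.0, 0.0, 0.2, 0.0],
     [0.0, 2.0, 0.0, 0.3],
     [0.2, 0.0, 3.0, 0.0],
     [0.0, 0.3, 0.0, 4.0]]
labels = ["a1", "a2", "a1", "a2"]
blocks = factor_by_label(h, labels)
```

Each submatrix in `blocks` can then be handed to the perturbation calculation on its own, which replaces one fourth-order secular determinant by two of the second order.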


49. THE ENERGY LEVELS OF AN HYDROGENIC ATOM IN A UNIFORM
MAGNETIC FIELD (SPIN NEGLECTED)¹

49a. Derivation of Hamiltonian Operator. — As an elementary
example of the operation of the perturbation method for an initially
degenerate system let us consider the effect of a constant and uniform
magnetic field on the energy levels and wave functions of an hydrogenic
atom. Our conclusions would require revision if we were to take into
account the electron spin, but the solution of the problem without the
electron-spin terms is a useful exercise to carry through at this point.
Let $H_0$ denote the unperturbed Hamiltonian

$$H_0 = -\frac{h^2}{8\pi^2\mu}\nabla^2 - \frac{Ze^2}{r}. \qquad (49\cdot1)$$


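The complete Hamiltonian for an hydrogenic atom in an arbitrary external
electromagnetic field with scalar potential $\Phi$ and vector potential
$\mathcal{A}$ is [cf. Eq. (7·11)]

$$H = H_0 + \frac{ieh}{2\pi\mu c}\left(\mathcal{A}_x\frac{\partial}{\partial x} + \mathcal{A}_y\frac{\partial}{\partial y} + \mathcal{A}_z\frac{\partial}{\partial z}\right) + \frac{ieh}{4\pi\mu c}\,\mathrm{div}\,\mathcal{A} + e\Phi + \frac{e^2}{2\mu c^2}|\mathcal{A}|^2. \qquad (49\cdot2)$$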


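We assume a uniform magnetic field of strength $\mathcal{H}$ in the
direction of the positive $z$ axis. Then

$$\mathrm{curl}\,\mathcal{A} = \mathbf{k}\,\mathcal{H}, \qquad (49\cdot3)$$

where $\mathbf{k}$ is a unit vector along the $z$ axis. Since the physical
results of our calculation will depend only on the curl of the vector
potential² we may choose for $\mathcal{A}$ the following simple particular
solution of Eq. (49·3):

$$\mathcal{A}_x = -\tfrac{1}{2}\mathcal{H}y; \qquad \mathcal{A}_y = \tfrac{1}{2}\mathcal{H}x; \qquad \mathcal{A}_z = 0. \qquad (49\cdot4)$$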

¹ Cf. L. Brillouin, J. de Physique 8, 74 (1927).

² Cf. discussion of "gauge invariance," p. 29.



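Introducing spherical coordinates,

$$\mathcal{A}_x\frac{\partial}{\partial x} + \mathcal{A}_y\frac{\partial}{\partial y} + \mathcal{A}_z\frac{\partial}{\partial z} = \frac{\mathcal{H}}{2}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) = \frac{\mathcal{H}}{2}\frac{\partial}{\partial\varphi} \qquad (49\cdot5)$$

and

$$|\mathcal{A}|^2 = \frac{\mathcal{H}^2}{4}(x^2 + y^2) = \frac{\mathcal{H}^2}{4}\,r^2\sin^2\theta. \qquad (49\cdot6)$$

Setting $\Phi$ equal to zero, the expression for $H$ reduces to

$$H = H_0 - \frac{e\mathcal{H}}{2\mu c}\,\frac{h}{2\pi i}\frac{\partial}{\partial\varphi} + \frac{e^2\mathcal{H}^2}{8\mu c^2}\,r^2\sin^2\theta. \qquad (49\cdot7)$$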


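We identify $-e\mathcal{H}/2\mu c$ with the parameter $\lambda$ of Secs. 47
and 48 and write $H$ in the standard form

$$H = H_0 + \lambda H_1 + \lambda^2 H_2, \qquad (49\cdot8)$$

with

$$H_1 = \frac{h}{2\pi i}\frac{\partial}{\partial\varphi}, \qquad (49\cdot9)$$

$$H_2 = \mu r^2\sin^2\theta/2. \qquad (49\cdot10)$$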

For the magnetic fields obtainable experimentally the second-order 
term is relatively small and unimportant spectroscopically. This 
term — for atoms and molecules in general — is essential, however, to the 
theory of diamagnetism. 

49b. Legitimacy of the Perturbation Method. — The first question to
be considered is the effect of the perturbative terms on the behavior of
solutions of the Schrödinger equation at infinity. We know already
that the unperturbed eigenfunctions having the standard form

$$u_{nlm}(r,\theta,\varphi) = R_{nl}(r)\,\Theta_{lm}(\theta)\,e^{im\varphi} \qquad (49\cdot11)$$

are simultaneous eigenfunctions of $H_0$ and $\mathcal{L}_z$. It follows
that the addition of the operator $\mathcal{L}_z$ or $H_1$ to the
unperturbed Hamiltonian leaves the behavior of solutions at the singular
points unchanged. In fact the functions $u_{nlm}$ are exact eigenfunctions
of the Hamiltonian $H_0 + \lambda H_1$.

$H_2$ increases the effective potential energy by an amount proportional
to $r^2\sin^2\theta$. Thus the limit of the effective potential energy as
$r$ increases indefinitely is changed from zero to plus infinity in every
direction except along the $z$ axis. The form of the continuous-spectrum
solutions will be radically altered in consequence and the discrete
energy levels will be raised. There is nothing in this perturbative term
to destroy the discrete spectrum, however, or to prevent the continuous
transition of discrete eigenfunctions of $H$ into corresponding
eigenfunctions of $H_0$ as $\lambda$ approaches zero. We conclude that the
application of the perturbation method to the determination of the
discrete eigenfunctions in the presence of the field is legitimate.

49c. First-order Energy Correction; Relation to Magnetic Moment
and Larmor Precession. — So long as $H_2$ is neglected, the problem of the
influence of a uniform magnetic field on an hydrogenic atom is not one
which requires perturbation methods. Since $u_{nlm}$ is a simultaneous
eigenfunction of $H_0$ and $H_1$ it is an eigenfunction of
$H_0 + \lambda H_1$ with eigenvalue

$$E_{nlm} = E_n^{(0)} + \frac{\lambda m h}{2\pi} = E_n^{(0)} - \frac{m e h \mathcal{H}}{4\pi\mu c}. \qquad (49\cdot12)$$

Thus, neglecting the secondary effects due to the square of the magnetic
field strength, the result of the application of the field is to split each
unperturbed energy level $E_n^{(0)}$ into $2n - 1$ equally spaced sublevels
symmetrically placed with respect to $E_n^{(0)}$. In these formulas $e$
denotes the algebraic value of the electronic charge, and the energy
perturbation is therefore positive when $m$, or the component of angular
momentum in the direction of the field, is positive.
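
The counting of sublevels can be made explicit. The sketch below is illustrative: it assumes only that, for total quantum number $n$, the index $l$ runs from 0 to $n - 1$ and $m$ from $-l$ to $l$, and that the first-order energy depends on $m$ alone. The function name and the numerical values of the unperturbed energy and of the spacing are hypothetical.

```python
def zeeman_sublevels(n, e_n0, spacing):
    """Return [(energy, weight), ...] for the first-order sublevels of level n.

    energy = e_n0 + m * spacing, and weight counts the (l, m) states
    sharing that value of m.
    """
    counts = {}
    for l in range(n):
        for m in range(-l, l + 1):
            counts[m] = counts.get(m, 0) + 1
    return [(e_n0 + m * spacing, counts[m]) for m in sorted(counts)]

# Hypothetical numbers: level n = 3 at energy -1.0 with sublevel spacing 0.01.
levels = zeeman_sublevels(3, -1.0, 0.01)
```

For $n = 3$ this gives $2n - 1 = 5$ sublevels with weights 1, 2, 3, 2, 1, the weights adding up to $n^2 = 9$; the weight $n - |m|$ of each sublevel is the part of the Coulomb degeneracy which the first-order term leaves untouched.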

From the expansions of Secs. 29d and 3(H‘ we know that the operators 
$H_0$, $\mathcal{L}^2$, $\mathcal{L}_z$ form a complete set of normally
commuting independent operators. The unperturbed problem has Coulomb
degeneracy superimposed on the normal degeneracy of the
rotation-reflection group (cf. Secs. 40e and f). On the other hand every
integral of the perturbed Hamiltonian $H_0 + \lambda H_1$ must commute with
$\mathcal{L}_z$ as well as with $H_0$. Rotations about axes other than the
$z$ axis are eliminated from the list of integrals of the Schrödinger
equation. With them go the real observables $R_x$, $R_y$, $\mathcal{L}_x$,
and $\mathcal{L}_y$. As a result of a drastic reduction in the group of
the Schrödinger equation every integral of the perturbed equation
commutes with $\mathcal{L}_z$, and $\mathcal{L}_z$ becomes a function of
$H$ within the discrete spectrum. Thus the perturbation removes the
degeneracy with respect to the different eigenvalues of $\mathcal{L}_z$.

Comparing this concrete situation with the general discussion of
Sec. 48d, p. 397, we note that the set of dynamical variables $\beta_1$,
$\beta_2$, $\cdots$ need contain but one member, which can be identified
with $\mathcal{L}_z$, or with $\mathcal{L}_x$, or $\mathcal{L}_y$. If we
identify $\beta$ with $\mathcal{L}_z$ we have an example of case (a) in
which $\beta$ commutes with $H$ and the $\alpha$'s but is not independent
of them. Otherwise we have an example of case (b). The same theory applied
to an atomic model in which an electron moves in a central force field
which does not obey the inverse-square law goes through as before
except that each energy level has a definite value of $\mathcal{L}^2$ so
that there is no dynamical variable of the type of the $\alpha$'s of
Sec. 48d.

The spacing of the sublevels into which $E_n^{(0)}$ is split by the field
is, by (49·12), $h|e|\mathcal{H}/4\pi\mu c$. Here the cofactor of Planck's
constant $h$ is the frequency $\nu_{\mathcal{H}}$ of the classical Larmor
precession in the field $\mathcal{H}$.



The classical vector magnetic moment $\mathfrak{m}$ of one or more spinless
electrons in orbital motion about a center of force with the vector
angular momentum $\mathcal{L}$ is given by the formula

$$\mathfrak{m} = \frac{e}{2\mu c}\,\mathcal{L}. \qquad (49\cdot13)$$


If the electron is under the influence of a magnetic field $\mathcal{H}$
in the direction of the $z$ axis, the classical component of angular
momentum along that axis is conserved and the mutual energy of the system
and the field is $-\mathcal{H}\mathfrak{m}_z$, or
$-(e\mathcal{H}/2\mu c)\mathcal{L}_z$, in conformity with (49·12).

The existence of a quantum-mechanical Larmor precession of frequency

$$\nu_{\mathcal{H}} = \frac{|e|\mathcal{H}}{4\pi\mu c} \qquad (49\cdot14)$$

in the case of an hydrogenic atom with partially determinate azimuthal
angle is readily derived from the above results. Since the value of
$\varphi$ for an assemblage of atoms with a unique value of
$\mathcal{L}_z$ is completely indeterminate, a wave function which
partially specifies the instantaneous value of $\varphi$ must involve a
multiplicity of $\mathcal{L}_z$'s and energy values. It follows from
Eqs. (49·12) and (49·14) that

$$\Psi = \sum_{n,l,m} c_{nlm}\,u_{nlm}\,e^{-\frac{2\pi i}{h}\left(E_n^{(0)} + m h \nu_{\mathcal{H}}\right)t}. \qquad (49\cdot15)$$


Let us now compare the wave functions of two assemljlagos in the same 
state at ^ = 0, one of which moves under the influence of a magnetic 
field X, while the other is subject to no such field. If Eq. (49-15) gives 
the wave function of the former, 

'i'l = e '* V*’"*’ ( 49 - 16 ) 

T» I m 

describes the wave function of the latter. If we now refer the motion 
of the first assemblage to a set of coordinates r, ip' which }M-ecesses 
about the z axis with the frequency the two wave functions take 
identical form. Thus the motion of the systems which are subject to 
the magnetic field, when referred to rotating coordinates, is exactly the 
same as the motion of those not subject to the field when referred to 
fixed coordinates. This is the exact quantum-mechani(*al e([uivalent 
of the classical Larmor theorem and can be used to derive the latter 
in the limiting case of a sharply defined wave packet. 
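The rotating-coordinate statement can be checked numerically. The following is a minimal sketch, assuming that the level with magnetic quantum number $m$ is shifted by $-mh\nu$ relative to the field-free level (the sign is an assumption made here for illustration and only fixes the sense of the precession). The coefficients `coeffs`, the frequency `nu`, and the function names are hypothetical.

```python
import numpy as np

def psi_no_field(phi, t, coeffs, ms, energy0=0.0, h=1.0):
    """Superposition of e^{i m phi} terms that share one unperturbed energy."""
    phase_t = np.exp(-2j * np.pi * energy0 * t / h)
    return sum(c * np.exp(1j * m * phi) for c, m in zip(coeffs, ms)) * phase_t

def psi_with_field(phi, t, coeffs, ms, nu, energy0=0.0, h=1.0):
    """Same superposition when level m is shifted by -m*h*nu (assumed sign)."""
    return sum(
        c * np.exp(1j * m * phi)
        * np.exp(-2j * np.pi * (energy0 - m * h * nu) * t / h)
        for c, m in zip(coeffs, ms)
    )

ms = np.array([-2, -1, 0, 1, 2])
coeffs = np.array([0.1, 0.3 + 0.2j, 0.7, 0.4j, 0.2])   # arbitrary illustrative values
nu, t = 0.37, 1.9
phi = np.linspace(0.0, 2.0 * np.pi, 50)

# With this sign choice the precessing azimuth is phi' = phi + 2*pi*nu*t.
lhs = psi_with_field(phi, t, coeffs, ms, nu)
rhs = psi_no_field(phi + 2.0 * np.pi * nu * t, t, coeffs, ms)
max_diff = float(np.max(np.abs(lhs - rhs)))
```

Here `max_diff` measures how far the wave function in the field departs from the field-free wave function evaluated at the precessing azimuth; it should vanish to rounding error for any choice of coefficients.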

In Sec. 55c it will be proved that the effect of a uniform magnetic field on the spectrum emitted by a hydrogenic-atom model with a spinless electron is also the same in quantum theory as in classical theory.



402 PERTURBATIONS WHICH DO NOT INVOLVE THE TIME [Chap. XI 


49d. The Second-order Energy Correction. In determining the second-order energy correction we have to apply Eqs. (48·18), (48·19), and (48·20). As the initial matrix $H_1$ is diagonal, it is not necessary to apply the canonical transformation which would otherwise be needed to bring $H_1$ to diagonal form. Equation (48·19) reduces to

$$F(nlm; nl'm') = H_2(nlm; nl'm'). \tag{49·17}$$

When the perturbing term $H_2$ is taken into account, the Coulomb degeneracy is lifted and $\mathcal{L}^2$ ceases to be an integral of the motion. The only remaining integrals are $\mathcal{L}_z$, $R_z$, and functions of them. Since $\mathcal{L}_z$ and $R_z$ commute, there is no degeneracy whatsoever in the discrete spectrum of the perturbed problem. Each perturbed level will have definite eigenvalues of $\mathcal{L}_z$ and $R_z$ which can be used along with the initial total quantum number in labeling. The unperturbed wave functions $u_{nlm}$ are symmetric or antisymmetric with respect to the operator $R_z$ (which merely replaces $\theta$ by $\pi - \theta$) according as $l - m$ is even or odd. Thus the matrices of both these integrals of the perturbed problem are initially diagonal and we are in a position to apply the procedure of Sec. 48d. The matrix $U$ will have no elements connecting different eigenvalues of $\mathcal{L}_z$ and $R_z$ and we can deal with one pair of eigenvalues of these operators at a time. (So far as the second-order correction is concerned the restriction to one eigenvalue of $\mathcal{L}_z$ at a time is already implied in Eqs. (48·19) and (48·20), since each different value of $m$ gives a different first-order energy correction.) Every element $U(nlm; nl'm')$ vanishes unless $m' = m$ and $l - m$ and $l' - m$ are either both even, or both odd. Hence $l - l'$ must be an even number.

Let ${}^{(nm\tau)}H_2$ denote the minor of $H_2$ composed of elements of the type $H_2(nlm; nl'm)$ with fixed $n$ and $m$ and even or odd values of $l$ and $l'$ according as $\tau$ is set equal to $+1$, or $-1$. The second-order energy corrections are then to be computed from the secular equation

$$\det\left({}^{(nm\tau)}H_2 - E^{(2)}\,1\right) = 0. \tag{49·18}$$

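The required elements of $H_2$ are given by the formula

$$H_2(nlm; nl'm) = \frac{\mu}{4\pi}\int\!\!\int\!\!\int R_{nl}R_{nl'}\Theta_{lm}\Theta_{l'm}\,r^4\sin^3\theta\,dr\,d\theta\,d\varphi. \tag{49·19}$$

Introducing the notation

$$\alpha(nl; nl') \equiv \frac{\mu}{2}\int_0^\infty R_{nl}R_{nl'}\,r^4\,dr,$$

$$\beta(lm; l'm) \equiv \int_0^\pi \Theta_{lm}\Theta_{l'm}\sin^3\theta\,d\theta = \int_{-1}^{+1}(1 - x^2)\,\Theta_{lm}(x)\Theta_{l'm}(x)\,dx,$$

we reduce the above to

$$H_2(nlm; nl'm) = \alpha(nl; nl')\,\beta(lm; l'm).$$

Since $\Theta_{lm}(x)$ is independent of the sign of $m$,

$$H_2(nlm; nl'm) = H_2(nl,-m; nl',-m). \tag{49·20}$$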



Sec. 50] 


THE STARK EFFECT FOR HYDROGEN 


403 


The detailed calculation of these integrals would not be very difficult but is omitted in view of the fact that the energy corrections are so small that they have no great physical interest.¹ Suffice it to note that the diagonal elements of $H_2$ are all essentially positive; that for fixed values of $l$ they increase rapidly with $n$ and $|m|$; and that for fixed values of $n$ and $m$ they decrease as $l$ increases. It follows from the invariance of the diagonal sum of the elements of a matrix with respect to a canonical transformation (cf. Sec. 44b) that the energy corrections are positive on the average and increase with $n$ and $|m|$. Actually the energy corrections are individually positive, for each is the mean value of a positive function of the coordinates for the correct zero-order wave functions. The secular determinants met with in computing the second-order energy corrections for values of $n$ up to and including $n = 4$ are never of order greater than 2.

The matrix is now derivable from (48- 18) since in this case Hi is 
diagonal and yields no information. From the zero-order wave 
functions are readily derived. 

From the equation immediately preceding (48·18) and the diagonal character of $H_1$ we see that in our present problem, all elements of the first-order transformation matrix connecting different unperturbed energies must vanish. Since $U$ has no nonvanishing components connecting different values of $m$, the first-order matrix has no such components. The only components which can differ from zero are consequently those which are diagonal in both $n$ and $m$. Equation (48·17) yields no information concerning them. In fact these elements are arbitrary except for the limitation imposed by (48·14). (This arbitrariness parallels that previously noted in connection with the nondegenerate case; cf. Sec. 47a, p. 383.) Equation (48·14) is most simply satisfied by setting the remaining elements of the first-order matrix equal to zero. Thus we can identify the first-order wave functions with those of zero order.

50. THE ENERGY LEVELS OF AN HYDROGENIC ATOM IN A UNIFORM ELECTRIC FIELD

The problem of the splitting of the lines in the spectrum of an hydrogenic atom by a uniform external electric field (Stark effect) was first solved by Schwarzschild² and Epstein³ in terms of the Bohr theory.

¹ Cf. O. Halpern and Th. Sexl, Ann. der Physik 3, 565 (1929); also J. H. Van Vleck, Electric and Magnetic Susceptibilities, pp. 178-220, Oxford, 1932, for other references and for general discussion of the relation of these energy corrections to the problem of diamagnetism.

² K. Schwarzschild, Berliner Sitzungsber., April, 1916, p. 548.

³ P. S. Epstein, Ann. d. Physik 50, 489 (1916). The application of the correspondence principle to the determination of the relative intensities of the lines in the pattern is due to H. A. Kramers, Danske Vidensk. Selsk. Skrifter (8) III, 3, 287, and Zeits. f. Physik 3, 169 (1920).





The agreement of their results with the experimental Stark-effect patterns was one of the most important early successes of the Bohr theory. The same results to terms linear in the field strength were later derived by Schrödinger and by Epstein¹ from the standpoint of wave mechanics. Schrödinger solved the problem by separation of the variables in parabolic coordinates and also gave a partial discussion by the perturbation method without resorting to parabolic coordinates. As an illustration of the method of Sec. 48 we here follow Schrödinger's second procedure. The electron spin, which plays a secondary role for hydrogenic atoms in strong electric fields, is neglected. For a discussion which takes spin into account the reader is referred to the papers of Rojansky² and Schlapp.³

Let the field be of strength $\mathcal{E}$ and directed along the positive $z$ axis. We can assume without loss of generality that the electrostatic potential vanishes at the nucleus. The Hamiltonian of Eq. (49·2) reduces to

$$H = H_0 - e\mathcal{E}z = H_0 + \lambda H_1, \tag{50·1}$$

with the conventions $\lambda = -e\mathcal{E}$; $H_1 = z$.

As in the problem of the external magnetic field all values of $\lambda$ have experimental significance and we need not distinguish between the interpolation problem and the perturbed problem.

Implicit in Eq. (50·1) is the assumption that the electric field extends uniformly to infinity in all directions. Of course this is never true experimentally, but we know that the spectrum of an atom is empirically sensitive only to the field in its immediate neighborhood and the above assumption is simpler from an analytic point of view than any other. In spite of its simplicity, however, the assumption leads to a mathematical difficulty in the application of the perturbation method. The perturbative term $\lambda z$ in the Hamiltonian becomes negatively infinite if $z$ approaches $-\infty$ when $\lambda$ is positive, and if $z$ approaches $+\infty$ when $\lambda$ is negative. In the region of large negative values of the perturbing potential the Laplacian of a solution of the first Schrödinger equation is necessarily negative and it is therefore impossible for $\psi$ to satisfy the boundary condition of quadratic integrability. In fact the perturbative term $\lambda z$ in the Hamiltonian has the effect of destroying the discrete spectrum of the problem and substituting for it a continuous spectrum of energy eigenvalues ranging from $-\infty$ to $+\infty$. It follows at once that the eigenfunctions and eigenvalues of the Hamiltonian $H$ of Eq.

¹ E. Schrödinger, Ann. d. Physik 80, 437 (1926); P. S. Epstein, Phys. Rev. 28, 695 (1926).

² V. Rojansky, Phys. Rev. 33, 1 (1929).

³ R. Schlapp, Proc. Roy. Soc. A119, 313 (1928).





(50·1) do not continuously approach the eigenfunctions and eigenvalues of $H_0$ as $\lambda$ approaches zero. Thus the fundamental postulate of the perturbation method (cf. Sec. 47a) is violated.

If the field is not too large, however, the situation bears a close resemblance to that of the Gamow-Gurney-Condon model for the alpha-particle disintegration of radioactive nuclei described in Sec. 31b. We have to do with a single particle under the influence of a potential $V$ which is strongly negative in the neighborhood of the origin, and which either passes through a maximum as we move away from the origin along a radial line or else becomes infinite at infinity, depending on the direction. For not too large negative energies there are two regions of positive kinetic energy separated by a potential barrier. As one region is finite and the other extends to infinity, the conditions are right for weakly quantized energy levels.¹ The experimental spectrum shows that these levels exist and approach the discrete levels of $H_0$ as $\lambda$ approaches zero. Spontaneous ionization of atoms by "leakage" through the potential barrier is a secondary phenomenon (cf. footnote 1, p. 181).

The wave functions of weakly quantized states are quadratically integrable solutions of the second Schrödinger equation which are almost, but not quite, monochromatic. They are accordingly approximate solutions of the first (time-free) Schrödinger equation. Hence, if we seek to solve the latter equation by successive approximations, using the perturbation method on a problem in which there are weakly quantized energy levels but no rigorously discrete energy eigenvalues, it is to be expected that early terms of the series will describe the imperfectly quantized states although the higher approximations do not converge in the normal manner. This expectation is confirmed by the result of applying perturbation theory to the problem in hand as well as by a study of the eigenvalue-eigenfunction problem by means of the variation method described in Sec. 51.

We proceed to the formal application of the procedure of Sec. 48. As in the magnetic case the group of the Schrödinger equation is reduced by the elimination of rotations about any but the $z$ axis. $\mathcal{L}^2$ and $R_z$ also drop out of the list of integrals, but $R_x$ and $R_y$ remain. Since neither $R_x$ nor $R_y$ commutes with $\mathcal{L}_z$ we infer that the perturbing energy does not completely remove the degeneracy of the unperturbed discrete energy levels. We use the same set of basic functions $u_{nlm}$ as in the discussion of the effect of a magnetic field in Sec. 49. There will then be no matrix components of $H_1$ involving different values of $m$ and we can deal with one value of $m$ at a time.

Since $z = r\cos\theta$ the general matrix element of $H_1$ is

$$H_1(nml; n'ml') = \int_0^\infty R_{nl}R_{n'l'}\,r^3\,dr\int_0^\pi \Theta_{lm}\Theta_{l'm}\cos\theta\sin\theta\,d\theta. \tag{50·2}$$

¹ Cf. C. Lanczos, Zeits. f. Physik 62, 518 (1930), 65, 431 (1930), 68, 204 (1931).





The factor integral involving $\theta$ can be written in the form

$$\int_0^\pi \Theta_{lm}\Theta_{l'm}\cos\theta\sin\theta\,d\theta = \left[\frac{(2l+1)(2l'+1)(l-m)!\,(l'-m)!}{4(l+m)!\,(l'+m)!}\right]^{1/2}\int_{-1}^{+1} x\,P_l^m(x)\,P_{l'}^m(x)\,dx \tag{50·3}$$

if we assume $m$ to be positive. Negative values of $m$ give the same matrix elements as the corresponding positive ones and it suffices to deal explicitly with the latter. The matrix elements diagonal in $n$ are also readily seen to be independent of the sign of $l' - l$. For first-order calculations we need only the elements diagonal in $n$ and can therefore restrict ourselves to an explicit discussion of the elements for which $l - l'$ is positive. By the recurrence formula (F·10) of Appendix F we have

$$\int_{-1}^{+1} x\,P_l^m P_{l'}^m\,dx = \frac{l-m+1}{2l+1}\int_{-1}^{+1} P_{l+1}^m P_{l'}^m\,dx + \frac{l+m}{2l+1}\int_{-1}^{+1} P_{l-1}^m P_{l'}^m\,dx. \tag{50·4}$$

This is different from zero only if $l' - l = \pm 1$. Thus all matrix elements vanish except those for which $l' - l = \pm 1$. It will suffice to compute elements of the form $H_1(n,m,l;\, n,m,l-1)$.

Setting $l' = l - 1$ in (50·4) and applying (F·15), we obtain

$$\int_{-1}^{+1} x\,P_l^m P_{l-1}^m\,dx = \frac{2\,(l+m)!}{(2l+1)(2l-1)(l-m-1)!}. \tag{50·5}$$

Equation (50·3) now reduces to the form

$$\int_0^\pi \Theta_{lm}\Theta_{l-1,m}\cos\theta\sin\theta\,d\theta = \left[\frac{l^2 - m^2}{4l^2 - 1}\right]^{1/2}. \tag{50·6}$$
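The closed form for the angular factor, $[(l^2 - m^2)/(4l^2 - 1)]^{1/2}$, can be checked by direct quadrature. The sketch below is illustrative only: it assumes the usual normalization $\Theta_{lm} = [(2l+1)(l-m)!/2(l+m)!]^{1/2}P_l^m(\cos\theta)$ with $m \ge 0$, builds $P_l^m$ from derivatives of Legendre polynomials, and uses Gauss-Legendre quadrature, which is exact here because the integrand is a polynomial in $x = \cos\theta$. The function names are hypothetical.

```python
from math import factorial, sqrt

import numpy as np
from numpy.polynomial import legendre as leg

def theta_lm(l, m, x):
    """Normalized associated Legendre function at x = cos(theta), for m >= 0.

    The phase factor (-1)^m is omitted; it cancels in products with equal m.
    """
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                                   # Legendre series of P_l
    dcoeffs = leg.legder(coeffs, m)                   # m-th derivative of P_l
    p_lm = (1.0 - x**2) ** (m / 2.0) * leg.legval(x, dcoeffs)
    norm = sqrt((2 * l + 1) / 2.0 * factorial(l - m) / factorial(l + m))
    return norm * p_lm

def angular_integral(l, m):
    """Integral of Theta_lm * Theta_(l-1)m * cos(theta) * sin(theta) d(theta)."""
    x, w = leg.leggauss(l + 2)          # exact for polynomial degree <= 2l + 3
    return float(np.sum(w * x * theta_lm(l, m, x) * theta_lm(l - 1, m, x)))

def closed_form(l, m):
    return sqrt((l**2 - m**2) / (4 * l**2 - 1))

worst = max(
    abs(angular_integral(l, m) - closed_form(l, m))
    for l in range(1, 7)
    for m in range(0, l)
)
```

The quantity `worst` collects the largest discrepancy between quadrature and the closed form over $l \le 6$; it should be at the level of rounding error.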

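We turn next to the integral containing the radial part of the wave function. By (29·8) the normalized radial wave function is

$$R_{nl}(r) = -\left\{\left(\frac{2Z}{na_0}\right)^3\frac{(n-l-1)!}{2n\,[(n+l)!]^3}\right\}^{1/2} e^{-x/2}\,x^l\,L_{n+l}^{2l+1}(x),$$

where $x = 2rZ/na_0$. Hence

$$\int_0^\infty R_{nl}(r)R_{n,l-1}(r)\,r^3\,dr = \frac{a_0}{4Z}\left\{\frac{(n-l-1)!\,(n-l)!}{[(n+l)!]^3\,[(n+l-1)!]^3}\right\}^{1/2}\int_0^\infty e^{-x}x^{2l+2}L_{n+l}^{2l+1}(x)\,L_{n+l-1}^{2l-1}(x)\,dx. \tag{50·7}$$

According to Eq. (G·13) of Appendix G the integral on the right has the value $-6n\,(n+l-1)!\,[(n+l)!]^2/(n-l-1)!$. Thus

$$\int_0^\infty R_{nl}(r)R_{n,l-1}(r)\,r^3\,dr = -\frac{3}{2}\,\frac{na_0}{Z}\sqrt{n^2 - l^2}. \tag{50·8}$$

The complete matrix element is

$$H_1(n,m,l;\, n,m,l-1) = -\frac{3}{2}\,\frac{na_0}{Z}\sqrt{\frac{(n^2 - l^2)(l^2 - m^2)}{4l^2 - 1}}. \tag{50·9}$$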
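The radial result $\int R_{nl}R_{n,l-1}r^3\,dr = -\tfrac{3}{2}(na_0/Z)\sqrt{n^2 - l^2}$ can likewise be checked by quadrature. The sketch below is an illustration under stated assumptions: it works in units $a_0/Z = 1$, and it uses the modern associated Laguerre convention ($L_k^{\alpha}$ with $L_k^{\alpha}(0) > 0$), which differs from the older convention with $[(n+l)!]^3$ in the normalization but yields the same radial functions, positive near the origin. The function names are hypothetical.

```python
from math import factorial, sqrt

import numpy as np
from numpy.polynomial import laguerre as lag

def assoc_laguerre(k, alpha, x):
    """Modern-convention L_k^alpha(x) = (-1)^alpha d^alpha/dx^alpha L_(k+alpha)(x)."""
    coeffs = np.zeros(k + alpha + 1)
    coeffs[k + alpha] = 1.0
    return (-1) ** alpha * lag.lagval(x, lag.lagder(coeffs, alpha))

def radial_poly(n, l, rho):
    """R_nl(r) * exp(rho/2), with rho = 2r/n and a0/Z = 1."""
    norm = sqrt((2.0 / n) ** 3 * factorial(n - l - 1) / (2.0 * n * factorial(n + l)))
    return norm * rho**l * assoc_laguerre(n - l - 1, 2 * l + 1, rho)

def radial_integral(n, l):
    """Integral of R_nl * R_(n,l-1) * r^3 dr by Gauss-Laguerre quadrature."""
    rho, w = lag.laggauss(n + 3)        # weight exp(-rho); exact for this polynomial
    r = n * rho / 2.0
    integrand = radial_poly(n, l, rho) * radial_poly(n, l - 1, rho) * r**3
    return float(np.sum(w * integrand) * n / 2.0)     # dr = (n/2) d(rho)

def closed_form(n, l):
    return -1.5 * n * sqrt(n**2 - l**2)

worst = max(
    abs(radial_integral(n, l) - closed_form(n, l))
    for n in range(2, 6)
    for l in range(1, n)
)
```

For $n = 2$, $l = 1$ the integral can also be done by hand and equals $-3\sqrt{3}$ in these units, which agrees with the closed form.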


Let ${}^{(nm)}H_1$ denote the minor of $\|H_1(nml; n'm'l')\|$ obtained by setting $n'$ equal to $n$, and $m'$ equal to $m$. The first-order energy corrections $E^{(1)}$ for states with unperturbed energy $E_n^{(0)}$ and the quantum number $m$ are the roots of the secular equation $\det\left({}^{(nm)}H_1 - E^{(1)}\,1\right) = 0$. If the rows and columns of ${}^{(nm)}H_1$ are arranged similarly according to increasing values of $l$ and $l'$, all elements vanish except those in the two lines bordering the principal diagonal. Thus the secular determinant has a simple form to which the term continuant has been applied. As an example we write down the explicit secular equation for $n = 5$ and $m = 2$:

$$\begin{vmatrix}
-E^{(1)} & -C\sqrt{\dfrac{(3^2-2^2)(5^2-3^2)}{4\times 3^2-1}} & 0 \\[2ex]
-C\sqrt{\dfrac{(3^2-2^2)(5^2-3^2)}{4\times 3^2-1}} & -E^{(1)} & -C\sqrt{\dfrac{(4^2-2^2)(5^2-4^2)}{4\times 4^2-1}} \\[2ex]
0 & -C\sqrt{\dfrac{(4^2-2^2)(5^2-4^2)}{4\times 4^2-1}} & -E^{(1)}
\end{vmatrix} = 0. \tag{50·10}$$

$C$ has the value

$$C = \frac{3}{2}\,\frac{na_0}{Z} = \frac{3}{8\pi^2}\,\frac{nh^2}{\mu e^2 Z}.$$

The general solution of these secular equations was first worked out by Schlapp¹ using a method described in Appendix K. The roots of the general secular equation are

$$E^{(1)} = C(n - |m| - 1),\ C(n - |m| - 3),\ \cdots,\ -C(n - |m| - 1).$$


Thus $E^{(1)}$ is equal to $C$ multiplied by an integer, say $k$, which ranges for fixed $n$ and $m$ from $-(n - |m| - 1)$ to $+(n - |m| - 1)$. However, different values of $m$ give overlapping sets of energy values, so that, considering all values of $m$ simultaneously, we can write

$$\lambda E^{(1)} = \lambda Ck = -\frac{3h^2\mathcal{E}}{8\pi^2\mu eZ}\,nk, \qquad k = 0, \pm 1, \pm 2, \cdots, \pm(n - 1). \tag{50·11}$$

For each set of values of $n$, $m$, $k$ there is just one wave function, but for given values of $n$ and $k$ the quantum number $m$ takes on alternate integral values between the limits $\pm(n - |k| - 1)$. The multiplicity of the energy level $\lambda E^{(1)} = \lambda Ck$ is accordingly $n - |k|$.

Equation (50·11) is the well-known formula for the linear Stark effect in hydrogen confirmed for moderate fields by numerous experiments.

¹ Loc. cit., footnote 3, p. 404.





For fields greater than 100 kilovolts per centimeter the quadratic and cubic terms also produce an appreciable effect. The reader is referred to the original papers¹ for a discussion of the theory of these higher order approximations.
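The first-order secular problem of this section is easy to examine numerically. The sketch below is an illustration, not the method of Appendix K: it builds, for fixed $n$ and $m$, the tridiagonal matrix whose off-diagonal elements are $-\sqrt{(n^2-l^2)(l^2-m^2)/(4l^2-1)}$ (energies measured in units of $C = \tfrac{3}{2}na_0/Z$), diagonalizes it, and counts how many states over all $m$ share a given integer $k$. The function names are hypothetical.

```python
import numpy as np

def stark_block(n, m):
    """Matrix of H_1 = z for fixed n and m, in units of C, rows ordered by l."""
    m = abs(m)
    ls = list(range(m, n))                      # l = |m|, |m|+1, ..., n-1
    block = np.zeros((len(ls), len(ls)))
    for i in range(1, len(ls)):
        l = ls[i]
        elem = -np.sqrt((n * n - l * l) * (l * l - m * m) / (4 * l * l - 1))
        block[i, i - 1] = block[i - 1, i] = elem
    return block

def stark_roots(n, m):
    """Roots of the secular equation in units of C, in ascending order."""
    return np.linalg.eigvalsh(stark_block(n, m))

def multiplicity(n, k, tol=1e-9):
    """Number of states with first-order energy C*k, summed over all m."""
    return sum(
        int(np.sum(np.abs(stark_roots(n, m) - k) < tol))
        for m in range(-(n - 1), n)
    )

roots_n5_m2 = stark_roots(5, 2)
```

For $n = 5$, $m = 2$ the three roots are $-2C$, $0$, $+2C$, in agreement with the general rule that $k$ runs from $-(n-|m|-1)$ to $+(n-|m|-1)$ in steps of 2, and `multiplicity(n, k)` reproduces $n - |k|$.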


51. THE VARIATIONAL METHOD

51a. Reduction of the Variational Problem to Algebraic Form. An extremely valuable supplement to the power-series perturbation theory is to be found in the so-called variational method. This method is based on the reduction of eigenvalue-eigenfunction problems to variational form as demonstrated in Sec. 24. We there used the variational formulation of such problems as an aid in proving the completeness of the system of eigenfunctions of a large class of Sturm-Liouville problems. Here we employ it as a guide in the approximate numerical computation of discrete eigenvalues and their eigenfunctions.

It follows from the work of Sec. 32c that if $H$ denotes a normal Hamiltonian operator, Hermitian with respect to class $D$, and if we define $Q$, $N$, $J$ by the equations

$$Q[\psi] \equiv (H\psi, \psi); \qquad N[\psi] \equiv (\psi, \psi), \tag{51·1}$$

$$J[\psi, E] \equiv Q - EN, \tag{51·2}$$

the problem of finding the class $D$ eigenfunctions of $H$ and their eigenvalues is equivalent to that of solving the variational problem

$$\delta J[\psi, E] = 0, \tag{51·3}$$

subject to class $D$ boundary-continuity conditions. Here $E$ is a parameter which must be properly chosen in order that (51·3) shall have a solution. To every extremal $\psi_m$ of Eq. (51·3) there corresponds an eigenvalue $E_m$ of $E$ such that $J[\psi_m, E_m] = 0$, or

$$E_m = \frac{Q[\psi_m]}{N[\psi_m]}. \tag{51·4}$$

Every such extremal $\psi_m$ is a class $D$ eigenfunction of $H$ with the eigenvalue $E_m$, and conversely every class $D$ eigenfunction of $H$ is a solution of the variational problem. The extremals of $J$ are also the extremals of $Q/N$.
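In a finite-dimensional setting the quotient $Q/N$ becomes the Rayleigh quotient of a Hermitian matrix, and its stationary property is easy to observe. The following is a minimal sketch in which a small symmetric matrix stands in for the Hamiltonian; the matrix entries and the function name are illustrative assumptions.

```python
import numpy as np

def rayleigh_quotient(H, xi):
    """Q/N for a coefficient vector xi: (xi^dagger H xi) / (xi^dagger xi)."""
    xi = np.asarray(xi, dtype=complex)
    return float(np.real(np.vdot(xi, H @ xi) / np.vdot(xi, xi)))

# A small real symmetric matrix standing in for the Hamiltonian (illustrative values).
H = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
eigvals, eigvecs = np.linalg.eigh(H)

rng = np.random.default_rng(0)
trial_values = [
    rayleigh_quotient(H, rng.normal(size=3) + 1j * rng.normal(size=3))
    for _ in range(200)
]

# Stationarity: a first-order change in an eigenvector changes Q/N only to second order.
direction = np.array([1.0, -2.0, 0.5])
eps = 1e-4
shifted = rayleigh_quotient(H, eigvecs[:, 1] + eps * direction)
second_order_change = abs(shifted - eigvals[1])
```

Every trial value lies between the lowest and highest eigenvalues, and `second_order_change` is of order `eps**2`, illustrating that each eigenvector is an extremal of $Q/N$ and that the lowest eigenvalue is its absolute minimum.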

We further assume the following propositions without rigorous proof.²

a. The lowest eigenvalue is an absolute minimum value of Q/N for 
class D comparison functions. 

¹ Quadratic Stark effect: P. S. Epstein, loc. cit.; G. Wentzel, Zeits. f. Physik 38, 527 (1926); I. Waller, Zeits. f. Physik 38, 640 (1926); J. H. Van Vleck, Proc. Nat. Acad. Sci. 12, 662 (1926). Cubic Stark effect: Theory worked out by Doi and published by Ishida and Hiyama, Sci. Pap. Inst. Phys. Chem. Research, Tokyo 9, 1 (1928).

² Reasons for these assumptions will be found in Secs. 25a and 32i.



Sec. 51] 


THE VARIATIONAL METHOD 


409 


b. If a sequence of orthogonal eigenfunctions $\psi_1, \psi_2, \psi_3, \cdots$ with corresponding eigenvalues $E_1, E_2, \cdots$ is so numbered that $E_{k+1} \ge E_k$ for all values of $k$, and if all discrete eigenfunctions are contained in the manifold of linear combinations of the $\psi$'s, it follows that $E_k = Q[\psi_k]/N[\psi_k]$ is the absolute minimum of $Q/N$ for comparison functions $\psi$ of class $D$, subject to the restriction that

$$(\psi, \psi_1) = (\psi, \psi_2) = \cdots = (\psi, \psi_{k-1}) = 0. \tag{51·5}$$

Let us reduce the variational problem to algebraic form by the introduction of a complete orthonormal system of functions $\varphi_1, \varphi_2, \varphi_3, \cdots$ all of which belong to the Hermitian domain of $H$. Let $\xi(k)$ denote the $k$th Fourier coefficient of a class $D$ function $\psi$ with respect to this system, i.e., $\xi(k) = (\psi, \varphi_k)$. We seek to determine the vector $\xi = (\xi(1), \xi(2), \cdots)$ so that $\psi$ shall be an extremal of $J[\psi, E]$. Let $H(m,k)$ denote the matrix element $(H\varphi_k, \varphi_m)$. From the Hermitian character of $H$ and the completeness relation for the $\varphi$'s [cf. Eq. (22·32)] we can derive the Fourier coefficients of $H\psi$ with respect to the $\varphi$'s. Thus

$$(H\psi, \varphi_m) = (\psi, H\varphi_m) = \sum_k \xi(k)\,(H\varphi_m, \varphi_k)^* = \sum_k H(m,k)\,\xi(k).$$

Using the completeness relation again, we derive

$$Q[\psi] = \sum_k (H\psi, \varphi_k)\,\xi(k)^* = \sum_k \xi(k)^*\Big[\sum_m H(k,m)\,\xi(m)\Big]. \tag{51·6}$$

Furthermore

$$N[\psi] = \sum_k \sum_m \xi(k)^*\xi(m)\,\delta_{km} = \sum_k \xi(k)\,\xi(k)^*, \tag{51·7}$$

$$J[\psi, E] = Q[\psi] - EN[\psi] = \sum_k \sum_m \xi(k)^*\xi(m)\,[H(k,m) - E\delta_{km}]. \tag{51·8}$$

Thus the integrals $Q$, $N$, $J$ are reduced to Hermitian quadratic forms in the parameters $\xi(1), \xi(2), \cdots$.

The first variation of $\psi$ can now be written as $\delta\psi = \sum_{k=1}^{\infty} \varphi_k\,\delta\xi(k)$, where the $\delta\xi$'s are arbitrary complex numbers. The first variation of the integral $J$ is obtained by differentiating the corresponding quadratic form with respect to the $\xi$'s. We can throw it into the form

$$\delta J = \sum_m \delta\xi(m)^* \sum_k \xi(k)\,[H(m,k) - E\delta_{mk}] + \sum_m \delta\xi(m) \sum_k \xi(k)^*\,[H(k,m) - E\delta_{km}]. \tag{51·9}$$

If $\psi$ is an extremal of $J$ for arbitrary admissible variations $\delta\psi$, $\delta J$ must vanish for arbitrary complex values of the $\delta\xi$'s. Hence it is necessary that the coefficient of every $\delta\xi(m)$ and every $\delta\xi(m)^*$ in (51·9) shall vanish, i.e.,

$$\sum_{k=1}^{\infty} \xi(k)\,[H(m,k) - E\delta_{mk}] = 0, \qquad m = 1, 2, 3, \cdots. \tag{51·10}$$

Conversely, it is easy to see (cf. p. 360) that a solution of Eqs. (51·10) for which the sums $\sum_k |\xi(k)|^2$ and $\sum_k\sum_m \xi(k)^* H(k,m)\,\xi(m)$ are convergent determines a corresponding function $\psi(x)$ which belongs to the Hermitian domain of $H$, extremalizes $J[\psi, E]$, and is representable to an arbitrarily high degree of precision in the least-squares sense by a finite number of terms of the series $\sum_k \varphi_k\,\xi(k)$. The method would give discrete eigenfunctions outside class $D$ if such existed. We assume, however, in accordance with the hypotheses of Sec. 32, that all discrete eigenvalues of $H$ are equipped with class $D$ eigenfunctions.

Except for a minor difference of notation these new equations are identical with (44·30). Thus the variation principle leads us directly back to the basic relations of the matrix theory.

51b. The Ritz Method. The derivation of Eqs. (51·10) from the variation principle suggests the use of the Ritz method of successive approximations for their solution.¹ This is a direct method of dealing with problems in the calculus of variations without the use of Euler's differential equations.

Let the problem to be solved be that of finding extremals of $Q[\psi]$ subject to class $D$ boundary-continuity conditions and to the requirement that the comparison functions shall be normalized to make $N[\psi] = 1$. The solutions will of course be normalized extremals of $Q/N$ and $J$. Assuming that the first $k - 1$ eigenvalues and their eigenfunctions have been found, we formulate the problem of finding the $k$th eigenvalue $E_k$ and its eigenfunction $\psi_k$ as that of minimizing $Q[\psi]$ subject to the normalization condition and the orthogonality conditions (51·5).

The essential feature of the method is the construction of a sequence of normalized comparison functions $\psi_k^{(1)}, \psi_k^{(2)}, \cdots$ conforming to (51·5) or having the property that $\lim_{n\to\infty}(\psi_k^{(n)}, \psi_r) = 0$ $(r = 1, 2, \cdots, k - 1)$, and of such a character that for every finite value of $n$, $Q[\psi_k^{(n)}] \ge E_k$, whereas $\lim_{n\to\infty} Q[\psi_k^{(n)}] = E_k$. Such a succession of functions is called a

¹ W. Ritz, J. f. reine u. angew. Math. 135, 1-61 (1909); Courant-Hilbert, p. 150.





minimal sequence. When such a sequence is constructed one looks to the function $\psi_k = \lim_{n\to\infty}\psi_k^{(n)}$ for a solution of the problem. The method is not necessarily successful in every case, but if the limit function exists and is an admissible comparison function, we can ordinarily expect that it will be a genuine extremal. Clearly the chance that the minimal sequence will converge on a definite limit function will be greatly enhanced if we can formulate the minimum problem in such a fashion as to insure that it has a unique solution, i.e., that the eigenvalue $E_k$ shall not have a multiplicity of linearly independent eigenfunctions which belong to the class of admissible comparison functions for the minimum problem. Ordinarily the desired uniqueness can be artificially created by imposing suitable symmetry conditions on the comparison functions of each of the minimum problems.

Our procedure is based on postulates a and b of pp. 408 and 409. Let us first consider the lowest energy level $E_1$. The orthogonality conditions (51·5) then drop out of the problem.

In order to construct the desired minimal sequence we make use of the orthonormal set of $\varphi$'s introduced in Sec. 51a. Let $w^{(n)}$ denote the arbitrary $n$-term linear combination

$$w^{(n)}(x) = \sum_{k=1}^{n} \varphi_k(x)\,\xi(k). \tag{51·11}$$

Let $v_1^{(n)}$ denote the special $n$-term linear combination

$$v_1^{(n)}(x) = \sum_{k=1}^{n} \varphi_k(x)\,\eta_1(k), \tag{51·12}$$

for which $\eta_1(k) = (\psi_1, \varphi_k)$. The sequence

$$\frac{v_1^{(1)}(x)}{\sqrt{N[v_1^{(1)}]}},\quad \frac{v_1^{(2)}(x)}{\sqrt{N[v_1^{(2)}]}},\quad \frac{v_1^{(3)}(x)}{\sqrt{N[v_1^{(3)}]}},\ \cdots$$

is easily proved to be a minimal sequence for $E_1$. In order to establish this fact we make use of (51·6), identifying $\psi$ with $\psi_1$ and replacing $\xi(k)$ by $\eta_1(k)$ throughout. The scalar products $\sum_m H(k,m)\,\eta_1(m)$ and $\sum_k \eta_1(k)^*\big[\sum_m H(k,m)\,\eta_1(m)\big]$ are absolutely convergent. It follows that for every positive $\epsilon$, however small, there exists a value of $n$ such that

$$\Big|\,Q[\psi_1] - \sum_{k=1}^{n}\sum_{m=1}^{n} H(k,m)\,\eta_1(m)\,\eta_1(k)^*\Big| < \epsilon \quad\text{and}\quad \Big|\,N[\psi_1] - \sum_{k=1}^{n} \eta_1(k)\,\eta_1(k)^*\Big| < \epsilon.$$

Hence

$$Q[\psi_1] = \sum_{k=1}^{n}\sum_{m=1}^{n} H(k,m)\,\eta_1(m)\,\eta_1(k)^* + O(\epsilon),$$

where $O(\epsilon)$ is a quantity of the order of magnitude of $\epsilon$. But it follows from (51·6) that

$$\frac{Q[v_1^{(n)}]}{N[v_1^{(n)}]} = \frac{\sum_{k=1}^{n}\sum_{m=1}^{n} H(k,m)\,\eta_1(m)\,\eta_1(k)^*}{\sum_{k=1}^{n} \eta_1(k)\,\eta_1(k)^*} = \frac{Q[\psi_1]}{N[\psi_1]} + O(\epsilon).$$

Therefore

$$\lim_{n\to\infty} \frac{Q[v_1^{(n)}]}{N[v_1^{(n)}]} = \frac{Q[\psi_1]}{N[\psi_1]} = E_1.$$

We conclude that the functions $v_1^{(n)}/\sqrt{N[v_1^{(n)}]}$ fulfill the requirements for a normalized minimal sequence. (Because $E_1$ is the absolute minimum of $Q[\psi]/N[\psi]$ the requirement $Q\big[v_1^{(n)}/\sqrt{N[v_1^{(n)}]}\big] \ge E_1$ is automatically satisfied.)

However, since we do not know $\psi_1$ we cannot actually set up this particular sequence of functions. Instead we introduce a different sequence, whose typical member is

$$u_1^{(n)}(x) = \sum_{k=1}^{n} \varphi_k(x)\,\xi_1^{(n)}(k),$$

the coefficients $\xi_1^{(n)}(k)$ being obtained by minimizing $Q[w^{(n)}(x)]$ with respect to the parameters $\xi(k)$ subject to the normalization condition

$$N[w^{(n)}(x)] = \sum_{k=1}^{n} \xi(k)\,\xi(k)^* = 1.$$

If $u_1^{(n)}(x)$ is constructed in this manner, it follows that

$$E_1 \le Q[u_1^{(n)}] \le \frac{Q[v_1^{(n)}]}{N[v_1^{(n)}]}.$$

Consequently the $u_1^{(n)}$'s must also form a normalized minimal sequence for $E_1$.

In order to find the coefficients $\xi_1^{(n)}(k)$ we make use of the relation

$$Q[w^{(n)}] = \sum_{k=1}^{n}\sum_{m=1}^{n} H(k,m)\,\xi(m)\,\xi(k)^*. \tag{51·13}$$

To derive the minimum value of this expression subject to the normalization condition is equivalent to finding a normalized vector which minimizes

$$J[w^{(n)}, E] = \sum_{k=1}^{n}\sum_{m=1}^{n} \xi(k)^*\xi(m)\,[H(k,m) - E\delta_{km}].$$

Setting the derivatives of $J[w^{(n)}, E]$ with respect to the real and imaginary parts of each coefficient equal to zero, one obtains the finite set of $n$ equations in $n$ unknowns,

$$\sum_{m=1}^{n} \xi(m)\,[H(k,m) - E\delta_{km}] = 0, \qquad k = 1, 2, \cdots, n. \tag{51·14}$$

Equations (51·10) form an obvious extrapolation of Eqs. (51·14) to the case where $n$ is infinite.

Equations (51·14) have a nontrivial solution if, and only if, $E$ is a root of the secular equation

$$\det\|H(k,m) - E\delta_{km}\| = 0. \tag{51·15}$$

The extreme values of $J$ are all zero (cf. Appendix H). Let $E_r^{(n)}$ be a root of (51·15) and let $\xi_r^{(n)}(k)$ be the corresponding normalized solution of (51·14). Finally, let

$$u_r^{(n)} = \sum_{k=1}^{n} \varphi_k\,\xi_r^{(n)}(k).$$

Then $Q[u_r^{(n)}]/N[u_r^{(n)}] = E_r^{(n)}$. The minimum value of $Q[w^{(n)}]/N[w^{(n)}]$ must therefore be the lowest of the $n$ roots of Eq. (51·15), which we call $E_1^{(n)}$. The corresponding set of functions $u_1^{(n)}$ form the desired minimal sequence, and it follows that $\lim_{n\to\infty} E_1^{(n)} = E_1$.

W— > 00 

We have found that the lowest eigenvalue of $H$ is the limit as $n$ approaches infinity of the lowest root of the $n$th-order secular equation (51·15). As the order $n$ increases, the root $E_1^{(n)}$ approaches $E_1$ monotonically from above. This follows from the fact that the linear manifold $M_n$, defined by the first $n$ $\varphi$'s, includes $M_{n-1}$ with the result that the minimum value of $Q$ for comparison functions in $M_n$ is less than or equal to the minimum value for comparison functions of $M_{n-1}$. Thus

$$E_1^{(n-1)} \ge E_1^{(n)} \ge E_1^{(n+1)} \ge \cdots \ge E_1.$$
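The monotonic descent of the lowest root can be watched in a concrete case. The sketch below is an illustration chosen here, not an example from the text: it takes the anharmonic oscillator $H = \tfrac12 p^2 + \tfrac12 x^2 + g x^4$ (with $\hbar = m = \omega = 1$), uses harmonic-oscillator eigenfunctions as the basic orthonormal set, and computes the lowest root of the $n$th-order secular equation for increasing $n$. The coupling `g` and the function names are hypothetical.

```python
import numpy as np

def anharmonic_matrix(size, g):
    """Matrix of H = p^2/2 + x^2/2 + g*x^4 in the oscillator basis."""
    big = size + 4                       # padding so x^4 is exact in the kept block
    k = np.arange(1, big)
    x = np.zeros((big, big))
    x[k - 1, k] = x[k, k - 1] = np.sqrt(k / 2.0)      # <k-1| x |k> = sqrt(k/2)
    x4 = np.linalg.matrix_power(x, 4)
    h0 = np.diag(np.arange(big) + 0.5)                # oscillator levels k + 1/2
    return (h0 + g * x4)[:size, :size]

def lowest_ritz_roots(n_max, g):
    """Lowest root of the n-th order secular equation for n = 1, ..., n_max."""
    H = anharmonic_matrix(n_max, g)
    return [float(np.linalg.eigvalsh(H[:n, :n])[0]) for n in range(1, n_max + 1)]

roots = lowest_ritz_roots(20, g=0.5)
```

The first entry is the plain expectation value $\tfrac12 + \tfrac34 g$ in the oscillator ground state; each later entry is less than or equal to its predecessor, and all remain upper bounds to the true lowest eigenvalue.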

If the eigenvalue $E_1$ is nondegenerate, the minimal sequence $u_1^{(1)}, u_1^{(2)}, \cdots$ can be so chosen that it converges in the mean-square sense upon an eigenfunction of $E_1$. To prove this statement, let us expand $u_1^{(n)}$ in terms of a complete orthonormal set of eigenfunctions of $H$, say $\psi_1, \psi_2, \cdots$. (In order to make the formulas more transparent we write them in the form appropriate to the case where $H$ has a purely discrete spectrum.) Thus, if $c_k^{(n)}$ is the Fourier coefficient $(u_1^{(n)}, \psi_k)$, we have

$$(Hu_1^{(n)}, \psi_k) = E_k\,(u_1^{(n)}, \psi_k) = E_k\,c_k^{(n)},$$

$$Q[u_1^{(n)}] = E_1^{(n)} = \sum_k E_k\,|c_k^{(n)}|^2.$$

Since $\sum_{k=1}^{\infty} |c_k^{(n)}|^2 = 1$, we have

$$E_1 = \lim_{n\to\infty} E_1^{(n)} = E_1\Big\{1 + \lim_{n\to\infty}\sum_{k>1}\Big(\frac{E_k}{E_1} - 1\Big)|c_k^{(n)}|^2\Big\},$$

or

$$\lim_{n\to\infty}\sum_{k=2}^{\infty}\Big(\frac{E_k}{E_1} - 1\Big)|c_k^{(n)}|^2 = 0.$$

But since the lowest energy level is nondegenerate,

$$\Big|\frac{E_k}{E_1} - 1\Big| \ge \Big|\frac{E_2}{E_1} - 1\Big| > 0, \qquad k \ge 2.$$

Hence $\lim_{n\to\infty}\big[\sum_{k=2}^{\infty}|c_k^{(n)}|^2\big] = 0$, and $\lim_{n\to\infty}|c_1^{(n)}|^2 = 1$. Let us next assume that $\psi_1$ is real and that in some portion of configuration space, say $G$, it is positive. We can choose $u_1^{(n)}$ also to be real and for sufficiently large $n$ we can make $u_1^{(n)}$ positive in $G$. Under these circumstances it is clear that $\lim_{n\to\infty} c_1^{(n)} = 1$, so that the Fourier coefficients of $u_1^{(n)}$ converge on those of $\psi_1$.

The norm of the function $\psi_1 - u_1^{(n)}$ is now seen to converge upon the limit zero as $n$ becomes infinite. Thus

$$\lim_{n\to\infty}(\psi_1 - u_1^{(n)},\ \psi_1 - u_1^{(n)}) = \lim_{n\to\infty}\Big\{|1 - c_1^{(n)}|^2 + \sum_{k=2}^{\infty}|c_k^{(n)}|^2\Big\} = \lim_{n\to\infty}\Big[\sum_{k=1}^{\infty}|c_k^{(n)}|^2 - c_1^{(n)} - c_1^{(n)*} + 1\Big] = 0.$$

This proves that the sequence of functions $u_1^{(n)}$ does actually converge in the root-mean-square sense on the eigenfunction $\psi_1$. For quantum-mechanical purposes this is all that we require.

There is nothing in the above work which forbids us to impose additional restrictions on the comparison functions provided that the Hamiltonian $H$ has eigenfunctions which satisfy these restrictions. Let us assume that $H$, together with the integrals $\alpha_1, \alpha_2, \cdots$, forms a complete set of commuting independent dynamical variables (cf. Sec. 37d). Then there will never be more than one simultaneous eigenfunction of $H$ and the $\alpha$'s with a given set of eigenvalues. If we can find a complete system of functions $\varphi_1, \varphi_2, \cdots$ which are simultaneous eigenfunctions of the $\alpha$'s, it is easy to deal with one set of eigenvalues of the $\alpha$'s at a time by imposing on the comparison functions of the minimum problem from the beginning the requirement that they shall be eigenfunctions of the $\alpha$'s with chosen eigenvalues. We then proceed as before using only those $\varphi$'s which conform to the new restrictions. In this way it is possible to compute to any desired approximation the lowest eigenvalue of $H$ having given symmetry properties. The corresponding eigenfunction is sure to be uniquely defined, except for a phase factor, so that a definite limit function can be attained.

51c. Higher Roots of the Secular Equation. The question now arises whether we cannot derive the higher eigenvalues and their eigenfunctions by a practical modification of the same procedure. The answer is that we have only to use the higher roots of the secular equation (51·15).¹ The $k$th root of that equation when the roots are arranged in ascending order of magnitude ($f$-fold roots appearing $f$ times) approaches the corresponding eigenvalue of $H$ as a limit when $n$ goes to infinity, and does so monotonically from above as in the preceding case.

It will be sufficient to consider the evaluation of the second eigenvalue $E_2$ with its eigenfunction $\psi_2$. Let $\bar{E}_2^{(n)}$ denote the minimum value of $Q[\psi]$ for all normalized comparison functions of the manifold $M_n$ which are orthogonal to $\psi_1$, as required by Eq. (51·5). Clearly the sequence of values of $\bar{E}_2^{(n)}$ approaches $E_2$ monotonically from above when $n \to \infty$; we can use the same arguments as in the preceding discussion of $E_1^{(n)}$. The corresponding root of the secular equation (51·15), viz., $E_2^{(n)}$, is easily shown² to be the minimum value of $Q[\psi]$ for all normalized comparison functions which belong to $M_n$ and are at the same time orthogonal to $u_1^{(n)}$. We assume that $E_1^{(n)}$ is a simple root of the secular equation for all values of $n$ and that in consequence $u_1^{(n)}$ approaches $\psi_1$ as a limit when $n$ becomes infinite. (Difficulties arising in the multiple-root case are believed to be purely formal.) Hence

$$\lim_{n\to\infty}(u_1^{(n)} - \psi_1,\ u_1^{(n)} - \psi_1) = 0.$$

Thus the orthogonality condition used for $E_2^{(n)}$ approaches that for $\bar{E}_2^{(n)}$ as $n$ becomes very large and we can infer that $\lim_{n\to\infty} E_2^{(n)} = \lim_{n\to\infty} \bar{E}_2^{(n)} = E_2$. It is a known theorem³

¹ Cf. E. A. Hylleraas and B. Undheim, Zeits. f. Physik 65, 759 (1930).

² Cf. Courant-Hilbert, M.M.P., Kap. I, §3, pp. 20-24. It will be observed that the eigenvectors $\xi_k^{(n)}$ can be used to form the columns of a unitary matrix $U$ which transforms the $n \times n$ element matrix

$$\begin{Vmatrix} H(1,1) & \cdots & H(1,n) \\ \vdots & & \vdots \\ H(n,1) & \cdots & H(n,n) \end{Vmatrix}$$

to diagonal form and reduces $Q[w^{(n)}]$ to a sum of squares. The eigenvalues $E_k^{(n)}$ are the corresponding diagonal elements of the transformed matrix.

³ Cf. Courant-Hilbert, M.M.P., Kap. I, §4, 1, pp. 20-29.



that $E_2^{(n)}$ is the maximum of all minima obtained by minimizing $Q[w^{(n)}]$ subject to $\sum_{m=1}^{n}|\xi(m)|^2 = 1$ and an orthogonality condition of the form $\sum_{m=1}^{n}\xi(m)\,a(m)^* = 0$. It follows that $E_2^{(n)}$ is always greater than or equal to $\bar{E}_2^{(n)}$. Hylleraas and Undheim¹ have proved that $E_2^{(n)}$ (more generally, $E_k^{(n)}$) approaches its limit monotonically from above. Similarly it is easy to show that each of the roots $E_k^{(n)}$ of the secular equation (51·15) yields a minimal sequence $u_k^{(1)}, u_k^{(2)}, \cdots$. In general each of these sequences yields an eigenvalue and eigenfunction. But if the number of discrete orthogonal eigenfunctions of the operator $H$ is finite, say $p$, we may expect that the minimal sequence for a root $E_k^{(n)}$ for which $k > p$ will not converge on any limit function.

51d. General Observations Regarding the Use of the Variational 
Method. — In practice the ultimate rigorous convergence of the successive 
Ritz approximations for $E_k$ and $\psi_k$ is of no great importance. General
formulas for $E_k^{(n)}$ and $u_k^{(n)}$ are not ordinarily obtainable and the work of
evaluating $E_k^{(n)}$ and $u_k^{(n)}$ increases rapidly with n. Hence the theoretical
physicist must ordinarily content himself with the numerical computation
of approximations for relatively small values of n. The success of such
calculations, like those made by the perturbation method, rests on a
happy choice of the basic sequence of functions $\varphi_1, \varphi_2, \varphi_3, \ldots$.

The fact that $E_k^{(n)}$ approaches $E_k$ monotonically from above is of the
greatest importance in the location of computational errors and especially 
in the comparison of computed and experimental energy values. If a 
value computed by the above outlined variational method lies below the 
experimental one, it follows that there has been a mistake in the com- 
putation, or that the calculation was made on an inadequate, or incorrect, 
basis. 
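The monotonic approach from above can be watched numerically on a problem whose exact answer is known. The following is only a sketch under stated assumptions, not a calculation from the text: the model (a harmonic oscillator with an added linear term, $H = \tfrac12 p^2 + \tfrac12 x^2 + gx$ in units where h/2π, the mass, and the frequency are unity), the oscillator eigenfunctions as basic sequence, and the function names are all chosen here for illustration. The exact lowest eigenvalue is $\tfrac12 - \tfrac12 g^2$, and the matrix of H in this basis is tridiagonal, so the lowest root of the n-term secular equation can be found by bisection.

```python
import math

def count_below(diag, off, x):
    """Sturm count: number of eigenvalues of a symmetric tridiagonal matrix below x."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 if i > 0 else 0.0
        q = d - x - coupling / q
        if q == 0.0:
            q = 1e-30  # step off an exact zero pivot so the next division is defined
        if q < 0.0:
            count += 1
    return count

def lowest_eigenvalue(diag, off, tol=1e-12):
    """Lowest root of the secular equation, bracketed by Gershgorin bounds and bisected."""
    radius = [0.0] * len(diag)
    for i, e in enumerate(off):
        radius[i] += abs(e)
        radius[i + 1] += abs(e)
    lo = min(d - r for d, r in zip(diag, radius)) - 1.0
    hi = max(d + r for d, r in zip(diag, radius)) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= 1:
            hi = mid   # at least one root lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def ritz_energies(g, n_max):
    """Lowest Ritz root for n = 1 .. n_max basic functions (illustrative model only)."""
    energies = []
    for n in range(1, n_max + 1):
        diag = [k + 0.5 for k in range(n)]                           # H(k,k)
        off = [g * math.sqrt((k + 1) / 2.0) for k in range(n - 1)]   # H(k,k+1)
        energies.append(lowest_eigenvalue(diag, off))
    return energies

g = 1.0
exact = 0.5 - 0.5 * g ** 2
for n, e in enumerate(ritz_energies(g, 8), start=1):
    print(n, round(e, 8), "above exact:", e >= exact - 1e-9)
```

Each added basic function can only lower the root, and no root falls below the exact value; a computed value lying below the true one would signal an error of the kind described above.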

It is evident that the convergence of the sequence of Ritz approximations
will be most rapid when the basic functions $\varphi_1, \varphi_2, \ldots$ are
themselves approximate eigenfunctions of H, perhaps obtained by the
exact solution of the eigenvalue-eigenfunction problem for a simplified
operator $H_0$ not very different from H. In this case we can employ
either the Ritz method or the perturbation method and a comparison
of the two schemes is illuminating. Consider, for example, a case in
which the operator $H_0$ has degenerate energy levels (cf. Sec. 48). Inspection
of the perturbation scheme for computing the first-order energy
corrections and the zero-order wave functions shows that it is equivalent
to the following. Let the unperturbed level $E_k^{(0)}$ have g-fold degeneracy
and let ${}^{(k)}\mathbf{H}$ and ${}^{(k)}\mathbf{H}_0$ denote the g × g element matrices obtained by
picking out those elements of H and $H_0$ which involve only the unperturbed
wave functions $u_{kj}$ of the level $E_k^{(0)}$. Thus

$${}^{(k)}H(m,n) = \int u_{km}^{*}\,H\,u_{kn}\,d\tau, \qquad m,n = 1, 2, \ldots, g. \qquad (51\cdot16)$$


¹ Loc. cit., footnote 1, p. 415.



Sec. 51] 


THE VARIATIONAL METHOD 


417 


Let ${}^{(k)}\mathbf{H}_1 = {}^{(k)}\mathbf{H} - {}^{(k)}\mathbf{H}_0$ denote the corresponding g × g block of the
perturbation. The first-order energy corrections
are the eigenvalues of ${}^{(k)}\mathbf{H}_1$ and the total perturbed energies to first-order
corrections (setting the parameter λ of Sec. 48 equal to unity) are the
eigenvalues of ${}^{(k)}\mathbf{H}$. The zero-order wave functions are linear combinations
of the g functions $u_{k1}, u_{k2}, \ldots, u_{kg}$ derivable from the normalized
eigenvectors of ${}^{(k)}\mathbf{H}_1$ or ${}^{(k)}\mathbf{H}$. These two matrices have the same eigenvectors
since their difference is a multiple of the unit matrix I.
Thus in this approximation the perturbation theory gives the same
eigenvalues and eigenfunctions as would be obtained by applying the
variational method to the location of the extremals of Q[ψ] for normalized
functions of the form

$$\psi = \sum_{j=1}^{g} u_{kj}\,\xi_j.$$

It will be observed, however, that in such an application of the variational
method the mth root of the secular equation does not give a good approximation
of the mth eigenvalue of H. On the contrary it gives an approximation
of the mth sublevel of the parent unperturbed level $E_k^{(0)}$.
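The equivalence can be checked on the smallest case. What follows is a minimal worked sketch for a two-fold degenerate level (g = 2). The real numbers a, b, c are hypothetical matrix elements of the perturbation between $u_{k1}$ and $u_{k2}$, introduced only for illustration; ${}^{(k)}\mathbf{H}$ and ${}^{(k)}\mathbf{H}_1$ stand for the 2 × 2 blocks of H and of $H - H_0$ picked out as in (51·16).

```latex
\[
{}^{(k)}\mathbf{H}_1 =
\begin{pmatrix} a & b \\ b & c \end{pmatrix},
\qquad
{}^{(k)}\mathbf{H} = E_k^{(0)}\,\mathbf{I} + {}^{(k)}\mathbf{H}_1 .
\]
% Roots of det( {}^{(k)}H_1 - \epsilon I ) = 0, i.e. the first-order corrections:
\[
\epsilon_{\pm} = \frac{a+c}{2} \pm \sqrt{\Big(\frac{a-c}{2}\Big)^{2} + b^{2}} .
\]
% Adding E_k^{(0)} I shifts every root by E_k^{(0)} and leaves the eigenvectors unchanged:
\[
E_{\pm} = E_k^{(0)} + \epsilon_{\pm} .
\]
```

The two values $E_\pm$ are at once the first-order perturbed energies and the two roots of the secular equation of the variational method restricted to $\psi = u_{k1}\xi_1 + u_{k2}\xi_2$; both are sublevels of the one parent level, whatever their position among the eigenvalues of H.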

We may conclude that, although (in the notation of p. 415)

$$\lim_{n\to\infty} E_k^{(n)} = E_k, \qquad \lim_{n\to\infty} u_k^{(n)} = \psi_k,$$

if we break off computations with some finite value of n the final approximation
$u_k^{(n)}$ may resemble $\psi_{k+r}$ (r > 0) much more closely than $\psi_k$. Under
these circumstances $E_k^{(n)}$ will approximate $E_{k+r}$ rather than $E_k$. Thus
the interpretation of the results obtained when we compute a Ritz
approximation with a finite number of basic functions $\varphi_1, \varphi_2, \ldots, \varphi_n$
depends essentially on our choice of these functions. If any of the φ's,
say $\varphi_k$, approximates from the beginning one of the eigenfunctions of H,
say $\psi_l$, we may expect that one of the $u^{(n)}$'s will also approximate $\psi_l$.
The corresponding root of the secular equation will then approximate $E_l$. It
will be evident to the reader that, powerful as this method is, it must be
used with judgment and discrimination. Fortunately the experimental
energies are usually available as a guide.

In the event that the basic set of φ's consists of the eigenfunctions of
an unperturbed operator $H_0$ it is important to know a priori just which
members of the set are most likely to contribute largely to any particular
exact eigenfunction of the perturbed operator H. A partial answer to
this question is given by the standard perturbation theory of Secs. 47
and 48 which shows that, if the perturbed energy value $E_{kl}$ originates
in the unperturbed eigenvalue $E_k^{(0)}$, the most important terms in the
expansion of the corresponding eigenfunction in the unperturbed wave functions,

$$\psi_{kl} = \sum_{n}\sum_{j} c(nj)\,u_{nj}, \qquad (51\cdot17)$$



for small values of λ are, (a) those involving the unperturbed wave
functions $u_{kj}$ of $E_k^{(0)}$ and, (b) those involving functions $u_{nj}$ for which
$|E_n^{(0)} - E_k^{(0)}|$ is small. This statement ignores the differences in the
magnitude of the different components of H, but to take these into
account usually accentuates somewhat the relative importance of the
small values of $|E_n^{(0)} - E_k^{(0)}|$. Thus, in general, H(nj;km) is small
when $|E_n^{(0)} - E_k^{(0)}|$ is large, and, according to Eq. (48·19), this effect
works in such a sense as to reduce the importance of terms which are
already small because of the factor $(E_k^{(0)} - E_n^{(0)})$ in the denominator.
The reader should bear in mind, however, that although the individual
contributions to the energy of terms with large $|E_n^{(0)} - E_k^{(0)}|$ are apt to
be small, the number of such terms is very large, so that their importance
in the aggregate may be considerable.

51e. Modifications of the Method; Construction of Eigenfunctions
from Non-orthogonal System. — Hitherto we have used the variational 
method only as a means of finding the best approximate eigenfunctions 
of H which are linear combinations of a finite number of terms of a
basic normal orthogonal set. The requirement that the basic functions
shall be normalized and mutually orthogonal is not essential, however,
and in many cases proves inconvenient. If this requirement is dropped
and the argument is carried through as before, one finds that $Q[u^{(n)}]$ is
given by (51·13) as before, but that

$$(u^{(n)}, u^{(n)}) = \sum_{m=1}^{n}\sum_{k=1}^{n} \xi_m^{*}\,\xi_k\,\sigma(m,k),$$

where $\sigma(m,k) = (\varphi_k, \varphi_m)$. Instead of Eqs. (51·14) and (51·15) one obtains

$$\sum_{k=1}^{n} \xi_k\,[H(m,k) - E\,\sigma(m,k)] = 0, \qquad m = 1, 2, 3, \ldots, n \qquad (51\cdot18)$$

and

$$\det\|H(m,k) - E\,\sigma(m,k)\| = 0, \qquad (51\cdot19)$$

respectively. These equations are not quite so easy to work with, but
the degree of the secular equation in the unknown E is unaltered.
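For two basic functions the secular equation (51·19) is a quadratic in E and can be solved in closed form. The sketch below is illustrative only: the function name and the numerical matrix elements are assumptions made here, not values from the text, and real symmetric H(m,k) and σ(m,k) are assumed.

```python
import math

def secular_roots_2x2(h11, h12, h22, s11, s12, s22):
    """Roots E of det|H(m,k) - E*sigma(m,k)| = 0 for a real symmetric 2 x 2 problem.

    Expanding the determinant gives a*E**2 + b*E + c = 0 with the
    coefficients below; sigma is assumed positive definite (a > 0).
    """
    a = s11 * s22 - s12 ** 2
    b = -(h11 * s22 + h22 * s11 - 2.0 * h12 * s12)
    c = h11 * h22 - h12 ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted(((-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)))

# Hypothetical matrix elements for two normalized but non-orthogonal functions.
lower, upper = secular_roots_2x2(h11=-1.0, h12=-0.8, h22=-1.0,
                                 s11=1.0, s12=0.5, s22=1.0)
print(lower, upper)

# With sigma equal to the unit matrix the equation reduces to the ordinary
# secular equation (51.15) of an orthonormal set.
print(secular_roots_2x2(1.0, 1.0, 3.0, 1.0, 0.0, 1.0))
```

As stated above, the overlap σ(1,2) changes the coefficients of the polynomial but not its degree: two basic functions still give exactly two roots.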

A still wider extension of the fundamental idea of the variational 
method is often used. Instead of finding the extreme values of Q[ψ]
for a class of comparison functions that contain certain parameters (the
ξ's) linearly, we can introduce a flexible ψ function involving nonlinear
parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$, evaluate Q as a function of these parameters,
and determine its minimum or extreme values. This procedure leads
to a set of nonlinear equations for the λ's, viz.,

$$\frac{\partial}{\partial\lambda_k}\,Q(\lambda_1, \ldots, \lambda_n) = 0, \qquad k = 1, 2, \ldots, n,$$

which must ordinarily be solved by cut-and-try methods.



Sec. 62 ] 


THE HYDROGEN MOLECULE 


419 


Usually the number of nonlinear parameters which can be introduced
to advantage is small and hence it is a common procedure to
evaluate Q only for a few explicit sets of λ values, resorting to interpolation
in order to locate the minimum. In the absence of a convenient
scheme for orthogonalizing the comparison functions for the relative
minimum problem associated with an upper energy level with respect
to the eigenfunctions of lower levels, the use of nonlinear parameters
is ordinarily restricted to the location of the lowest energy eigenvalue
of a given symmetry type.
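The procedure of evaluating Q at a few parameter values and interpolating can be sketched on a one-parameter example. The choice of problem is an assumption made here for illustration and is not taken from the text: the normal state of the hydrogen atom with the normalized trial function proportional to $e^{-\alpha r^2}$, for which, in Hartree atomic units, $Q(\alpha) = \tfrac32\alpha - 2\sqrt{2\alpha/\pi}$. The three trial values of α and the function names are likewise hypothetical.

```python
import math

def energy(alpha):
    """Q for the normalized trial function exp(-alpha*r**2), hydrogen atom, Hartree units."""
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)

def parabola_vertex(x1, y1, x2, y2, x3, y3):
    """Abscissa of the extremum of the parabola through three points."""
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    return x2 - 0.5 * num / den

# Evaluate Q for a few explicit parameter values only ...
alphas = (0.2, 0.3, 0.4)
values = [energy(a) for a in alphas]

# ... and interpolate to locate the minimum.
alpha_star = parabola_vertex(alphas[0], values[0],
                             alphas[1], values[1],
                             alphas[2], values[2])
print(alpha_star, energy(alpha_star))

# For comparison: the stationarity condition dQ/dalpha = 0 can be solved by hand here.
alpha_exact = 8.0 / (9.0 * math.pi)
print(alpha_exact, energy(alpha_exact))
```

For this trial family the minimum lies at α = 8/9π with Q = −4/3π, which is above the true normal-state energy −1/2, as it must be for the lowest level; the three-point interpolation lands close to that minimum without solving the nonlinear equation.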

52. THE PROBLEM OF THE HYDROGEN MOLECULE

52a. The Fixed -nuclei Problem. — As an example of the application 
of the variational method to the approximate evaluation of energy levels, 
let us consider the simplest of molecular problems, viz., that of the
hydrogen molecule.

Owing to the large ratio of the nuclear mass to the electronic mass it
is possible in first approximation to resolve the eigenvalue-eigenfunction
problem for a diatomic molecule into two parts.¹ In the first stage of the
computation one treats the nuclei as fixed centers of force and evaluates
the eigenvalues and eigenfunctions of the simplified Hamiltonian $H^{(e)}$
for the electronic motion in the presence of the stationary nuclei. The
energies $E^{(e)}$ derived in this way are functions of the internuclear distance r
which also appears as a parameter in the electronic wave function $\psi^{(e)}$.
In the second stage of the computation one evaluates the energies
and wave functions of a two-particle nuclear problem in which
$E^{(e)}(r)$ plays the part of potential energy² and a term is added for the
gyroscopic effect of the electronic angular momentum if necessary.
Each electronic state, i.e., each $E^{(e)}(r)$ when properly defined, yields its
own nuclear problem and a corresponding complete system of nuclear 
wave functions. If any particular electronic state A has no component 
of angular momentum along the internuclear axis, the corresponding 
nuclear problem is non-gyroscopic and reduces to the problem of the 
‘dumbbell” model discussed in Secs. 2%h and 47c. An approximate 
complete eigenfunction of any definite state of electronic and nuclear 
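motion is of the form $\psi^{(e)}\psi^{(n)}$, i.e., the product of the electronic wave
function and a nuclear wave function.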

We shall interest ourselves here in the first stage of the molecular 
problem, viz., with the evaluation of the energies and wave functions for 

¹ For an elementary classical discussion of this resolution of the molecular problem,
cf. E. C. Kemble, "Report on Molecular Spectra in Gases," Nat. Research Council,
1926, p. 293. The quantum-mechanical justification was first given by Born and
Oppenheimer, Ann. d. Physik 84, 457 (1927). Cf. also R. de L. Kronig, Zeits. f.
Physik 50, 347 (1928); J. H. Van Vleck, Phys. Rev. 33, 467 (1929).

² We assume that the Coulomb potential of the bare nuclei is included in the
Hamiltonian of the first-stage electronic problem.





fixed nuclei. In this stage lies the problem of the chemical union of the
atoms. If such union is possible for any given electronic state, the
function $E^{(e)}(r)$ should have the form of the curve for V(r) shown in
Fig. 10, and the dissociation energy D, aside from a small correction for
the minimum energy of nuclear vibration, should be equal to the difference
$E^{(e)}(\infty) - E^{(e)}(r_0)$. The moment of inertia of the molecule in a
vibrationless state and its classical frequency of vibration are also
derivable from the graph of $E^{(e)}(r)$ when once that function has been
evaluated.
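How these three quantities are read off an energy curve can be sketched as follows. Everything numerical here is an assumption made for illustration: a Morse-type function stands in for the computed electronic energy curve, and its parameters (in Hartree atomic units) are only roughly hydrogen-like values, not results from the text.

```python
import math

# Hypothetical curve parameters (Hartree atomic units), chosen for illustration.
D_E = 0.1745   # depth of the minimum below the separated-atom energy
A = 1.03       # range parameter, 1/bohr
R0 = 1.40      # position of the minimum, bohr
MU = 918.08    # reduced mass of two protons, in electron masses

def energy_curve(r):
    """Stand-in for the electronic energy as a function of nuclear separation (Morse form)."""
    return D_E * (1.0 - math.exp(-A * (r - R0))) ** 2 - D_E

def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of the curvature of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# Dissociation energy: E(infinity) - E(r0); r = 50 bohr stands in for infinity.
dissociation = energy_curve(50.0) - energy_curve(R0)

# Moment of inertia of the vibrationless molecule about its center of mass.
moment_of_inertia = MU * R0 ** 2

# Classical vibration: force constant from the curvature at the minimum.
force_constant = second_derivative(energy_curve, R0)
angular_frequency = math.sqrt(force_constant / MU)

print(dissociation, moment_of_inertia, force_constant, angular_frequency)
```

The dissociation energy obtained this way omits the small correction for the minimum energy of nuclear vibration mentioned above.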

Fig. 21. Coordinates for the hydrogen-molecule problem.

In the case of molecular hydrogen we have
two nuclei which we designate as A and B and
two electrons which we designate by the
numbers 1 and 2 (cf. Fig. 21). Let $a_1$ and $b_1$
denote the distances of electron 1 from the
nuclei A and B, respectively. Let $a_2$ and $b_2$
denote the corresponding distances for electron 2. Let r and s denote the
internuclear and interelectronic distances, respectively. The basic differential
equation for the fixed-nuclei problem is conveniently expressed in
terms of Hartree's atomic units¹ in which it takes the form

$$H\psi \equiv -\tfrac{1}{2}(\nabla_1^{2} + \nabla_2^{2})\psi + \Big(\frac{1}{r} + \frac{1}{s} - \frac{1}{a_1} - \frac{1}{a_2} - \frac{1}{b_1} - \frac{1}{b_2}\Big)\psi = E^{(e)}\psi, \qquad (52\cdot1)$$

if we omit the electron-spin terms as usual. In the future we shall dis- 
card the superscript and designate the energy in this problem by E. 

52b. The Heitler and London Calculation. — The first attack on the
solution of the above equation was made by Heitler and London² who
saw in the known wave functions of the separate atoms a means for
approximating the wave function of the molecule. If the nuclei are
far apart, it is not difficult to see that we shall have a good approximate
solution of the equation if we write $\psi = f_1 g_2$, where $f_1$ is an atomic wave

¹ D. R. Hartree, Proc. Cambridge Phil. Soc. 24, 89 (1928). In this system the
fundamental units are as follows:

Unit of length = $a_0 = h^2/4\pi^2\mu e^2$ = radius of innermost Bohr orbit for the hydrogen atom.

Unit of mass = μ = mass of an electron.

Unit of charge = e = charge of an electron.

Unit of time = $h^3/8\pi^3\mu e^4$.

Unit of action = h/2π.

Unit of energy = $e^2/a_0$ = 2 × (ionization energy of normal hydrogen atom).

The reader is referred to Condon and Shortley, Theory of Atomic Spectra,
Cambridge and New York, 1935, for a discussion of these units.

² W. Heitler and F. London, Zeits. f. Physik 44, 455 (1927).





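function for electron 1 on nucleus A, and $g_2$ is an atomic wave function
for electron 2 on nucleus B. In this case,

$$\Big(-\tfrac{1}{2}\nabla_1^{2} - \frac{1}{a_1}\Big) f_1 = E_1 f_1, \qquad \Big(-\tfrac{1}{2}\nabla_2^{2} - \frac{1}{b_2}\Big) g_2 = E_2 g_2,$$

where $E_1$ and $E_2$ are atomic energy-values. It follows from these relations
that

$$[H - (E_1 + E_2)]\,f_1 g_2 = \Big(\frac{1}{r} + \frac{1}{s} - \frac{1}{a_2} - \frac{1}{b_1}\Big) f_1 g_2.$$

If r is much larger than the effective radius of the hydrogen atom in either
of the states under consideration, the quantity $\big(\frac{1}{r} + \frac{1}{s} - \frac{1}{a_2} - \frac{1}{b_1}\big)$
will always be very small when $f_1 g_2$ is relatively large and vice versa.
Hence the integrals $\int f_1^{*} g_2^{*}\,[H - (E_1 + E_2)]\,f_1 g_2\,d\tau$ and

$$\int \big|[H - (E_1 + E_2)]\,f_1 g_2\big|^{2}\,d\tau$$

will be very small and $f_1 g_2$ must be a good approximate solution of
(52·1) for the energy value $E = E_1 + E_2$.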

The function $f_1 g_2$ is an eigenfunction of the operator

$$H_0 = -\tfrac{1}{2}(\nabla_1^{2} + \nabla_2^{2}) - \frac{1}{a_1} - \frac{1}{b_2},$$

and it would be possible to expand an arbitrary solution of the molecular
problem in terms of the complete system of eigenfunctions of this operator.
Thus the molecular problem could be dealt with by the conventional
perturbation theory of Secs. 47 and 48. It is easy to see, however,
that this mode of attack is entirely impracticable. Consider the function
$f_2 g_1$ which is identical with $f_1 g_2$ except that it places electron 2 on A
in the state previously occupied by 1 and the electron 1 on B in the state
previously occupied by 2. $f_2 g_1$ is an eigenfunction of the operator

$$H_0' = -\tfrac{1}{2}(\nabla_1^{2} + \nabla_2^{2}) - \frac{1}{a_2} - \frac{1}{b_1},$$

and as good an approximate solution
of (52·1) as $f_1 g_2$. Inspection shows, however, that this new function
$f_2 g_1$ is no good at all as an approximate eigenfunction of $H_0$. This
means that expansions of the eigenfunctions of H in terms of those of
$H_0$ will not ordinarily converge with any reasonable degree of rapidity,
and that the perturbation theory of Sec. 48 with $H_0$ as the unperturbed
operator will not be useful.





We may reasonably assume, however, that fairly good approximations
of the low-energy eigenfunctions of molecular hydrogen can be built
up out of the low-energy eigenfunctions of the two symmetrical operators
$H_0$ and $H_0'$. This is the essential feature of the Heitler and London
method. The details are most readily understood as an example of the
variational method, though that was not the point of view of the original
work. Since there is a wide gap between the energy of the normal
state of the hydrogen atom and the energy of the first excited state,
it seemed reasonable to use only the normal states of the two atoms as
material for the construction of the normal state of the molecule. We
accordingly introduce the functions

$$\varphi_1 = f_1 g_2, \qquad \varphi_2 = f_2 g_1,$$

in which $f_1$, $f_2$, $g_1$, $g_2$ are now identified with normalized normal-state
atomic eigenfunctions. In Hartree atomic units these are (cf. Sec. 29c)

$$f_1 = \pi^{-1/2} e^{-a_1}, \qquad g_1 = \pi^{-1/2} e^{-b_1},$$
$$f_2 = \pi^{-1/2} e^{-a_2}, \qquad g_2 = \pi^{-1/2} e^{-b_2}.$$

The functions $\varphi_1$ and $\varphi_2$ are not quite orthogonal and hence we must
use Eqs. (51·18) and (51·19) in applying the variational method.

Let S denote the scalar product of $\varphi_1$ and $\varphi_2$. Since these functions
are both real, S is real. Let $I_1$ and $I_2$ denote the integrals

$$I_1 = \int \varphi_1^{2}\Big(\frac{1}{s} - \frac{1}{a_2} - \frac{1}{b_1}\Big)d\tau, \qquad I_2 = \int \varphi_1\varphi_2\Big(\frac{1}{s} - \frac{1}{a_2} - \frac{1}{b_1}\Big)d\tau.$$

Here dτ is the element of volume in the six-dimensional coordinate space.
Let $E_0$ denote the ionization energy of the normal hydrogen atom with the
numerical value $-\tfrac{1}{2}$ in the Hartree units. Then

$$H(1,1) = H(2,2) = 2E_0 + \frac{1}{r} + I_1,$$
$$H(1,2) = H(2,1) = \Big(2E_0 + \frac{1}{r}\Big)S + I_2,$$
$$\sigma(1,1) = \sigma(2,2) = 1,$$
$$\sigma(1,2) = \sigma(2,1) = S.$$

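The secular equation (51·19) becomes

$$\begin{vmatrix} H(1,1) - E & H(1,2) - ES \\ H(1,2) - ES & H(1,1) - E \end{vmatrix} = 0$$

and has the solutions

$$E' = \frac{H(1,1) + H(1,2)}{1 + S} = 2E_0 + \frac{1}{r} + \frac{I_1 + I_2}{1 + S}, \qquad (52\cdot2)$$

$$E'' = \frac{H(1,1) - H(1,2)}{1 - S} = 2E_0 + \frac{1}{r} + \frac{I_1 - I_2}{1 - S}. \qquad (52\cdot3)$$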
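The step from the determinant to the two roots is short enough to write out. The following lines only expand the secular equation (51·19) for the two functions $\varphi_1$, $\varphi_2$ using the matrix elements listed above; no new quantities are introduced.

```latex
% The 2 x 2 determinant is a difference of two squares:
\[
[H(1,1) - E]^{2} - [H(1,2) - ES]^{2} = 0
\quad\Longrightarrow\quad
H(1,1) - E = \pm\,[H(1,2) - ES].
\]
% Lower sign: E(1+S) = H(1,1) + H(1,2).   Upper sign: E(1-S) = H(1,1) - H(1,2).
\[
E' = \frac{H(1,1) + H(1,2)}{1+S},
\qquad
E'' = \frac{H(1,1) - H(1,2)}{1-S}.
\]
% Inserting H(1,1) = 2E_0 + 1/r + I_1 and H(1,2) = (2E_0 + 1/r)S + I_2:
\[
H(1,1) \pm H(1,2) = \Big(2E_0 + \frac{1}{r}\Big)(1 \pm S) + (I_1 \pm I_2),
\]
% so the common factor (1 +/- S) cancels in the first term of each root.
```

The term $2E_0 + 1/r$ is thus common to both roots, and the whole difference between them comes from the integrals $I_1$, $I_2$ and the overlap S.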





Let the corresponding relative extremals of Q be

$$\psi' = \xi_1'\varphi_1 + \xi_2'\varphi_2$$

and

$$\psi'' = \xi_1''\varphi_1 + \xi_2''\varphi_2,$$

respectively. From Eqs. (51·18) we readily derive

$$\xi_1' = \xi_2', \qquad \xi_1'' = -\xi_2''.$$

The normalized solutions are accordingly

$$\psi' = [2(1 + S)]^{-1/2}(\varphi_1 + \varphi_2), \qquad \psi'' = [2(1 - S)]^{-1/2}(\varphi_1 - \varphi_2). \qquad (52\cdot4)$$

$I_1$ is called a "Coulomb" integral as it gives the mean value of the
Coulomb energy $\frac{1}{s} - \frac{1}{a_2} - \frac{1}{b_1}$ for the state $\varphi_1$. Because the coordinates
of the two electrons are exchanged when we pass from one of the two
product wave functions which appear in the integrand to the other, $I_2$ is
called an "exchange" integral. The relation of the functions ψ′, ψ″ to the
Pauli principle, and the interpretation of the energy difference E′ − E″,
approximately determined by $I_2$, as a by-product of a resonance phenomenon
involving the exchange of electrons between the nuclei, is
discussed in Sec. 65b.

Fig. 22. Theoretical potential-energy curves for the lowest states of the H₂ molecule
compared with a Morse-type curve fitted to the empirical data.

The integrals S, $I_1$, and $I_2$ are, of course, six-dimensional and their
evaluation is somewhat troublesome. It has been carried through,
however, in part by Heitler and London, and in part by Sugiura¹ to
whose paper the reader is referred for the explicit formulas. Figure 22
shows the curves for E′ and E″ plotted against the nuclear separation r
and for comparison an approximate curve for the apparent potential
energy of the hydrogen molecule in its normal state computed backward
from the experimental band-spectrum data.² From the graph it will

¹ Y. Sugiura, Zeits. f. Physik 45, 484 (1927).

² Cf. R. S. Mulliken, Rev. Mod. Phys. 4, 73-78 (1932).




be seen that the curve for E′ is in reasonable harmony with the empirical
data considering the roughness of the calculation, while the E″ curve
represents a type of interatomic interaction in which the force between
the atoms is apparently repulsive at all distances.¹ The existence
of such repulsive states for potentially molecule-forming atom pairs
was unsuspected before the theoretical work of Heitler and London but is
now known to play an essential part in the production of the continuous
spectrum of H₂ and in related phenomena.

Instead of deriving the minimizing functions ψ′, ψ″ and their energies
directly from the non-orthogonal initial approximations $\varphi_1$, $\varphi_2$ by means of
Eqs. (51·18) and (51·19), we might equally well have used $\varphi_1$, $\varphi_2$ to build
up a pair of orthogonal initial approximations which were eigenfunctions
of the symmetry integrals of the problem.

Let us identify the internuclear axis with the z axis of a Cartesian- 
coordinate system and locate the xy plane halfway between the two 
nuclei. The system has an axial symmetry similar to that met with in 
Secs. 49 and 50. The group of the Schrödinger equation includes rotations
about the z axis, the three reflection operators $R_x$, $R_y$, $R_z$, and the
permutation operator P which performs the transformation

$$x_1 \to x_1' = x_2, \qquad x_2 \to x_2' = x_1,$$
$$y_1 \to y_1' = y_2, \qquad y_2 \to y_2' = y_1,$$
$$z_1 \to z_1' = z_2, \qquad z_2 \to z_2' = z_1.$$

All integrals are dependent on $\mathcal{L}_z$, $R_x$, $R_y$, $R_z$ and P. But the only
non-commuting pairs in this set are $\mathcal{L}_z$, $R_x$ and $\mathcal{L}_z$, $R_y$. It follows that H
combines with each of the three dynamical variables $\mathcal{L}_z$, $R_x$, $R_y$ to form
a complete set of commuting independent dynamical variables. Since
$R_x$ has but two eigenvalues, the maximum statistical weight of an energy
level is 2.

The dynamical variables $\mathcal{L}_z^{2}$, P, K are all functions of H since
each commutes with every other integral of H. It follows that every
discrete energy level of Eq. (52·1) is correlated with a definite eigenvalue
of each of the above observables. The eigenvalues of $\mathcal{L}_z^{2}$, P, K are used
in the classification of the electronic states of molecular hydrogen.
Thus states having the eigenvalues $0,\ h^2/4\pi^2,\ 4h^2/4\pi^2,\ \ldots$ for $\mathcal{L}_z^{2}$
are designated as Σ, Π, Δ, ... states respectively. States having the
eigenvalues +1, −1 for K are designated as even and odd states. States
having the eigenvalues +1, −1 for P (cf. Sec. 40d) are said to be symmetrical
and antisymmetrical, respectively, with respect to an interchange
of electrons. When the spin is taken into account each of the

¹ Actually, there is a weak attraction at large distances which may be identified
with the Van der Waals force of classical gas theory. This is derivable by higher order
approximations. The corresponding minimum of E″ is too shallow to permit the
formation of molecules of normal stability.





antisymmetrical states splits into three substates, whereas no such 
splitting occurs in the case of the symmetrical states. Consequently 
the symmetrical states are called "singlets" and the antisymmetrical
ones “triplets.” The symbol denotes an even (gerade) triplet state 

for which while denotes an odd (ungerade) singlet state 

for which $\mathcal{L}_z^{2} = 0$.

In seeking solutions of Eq. (52·1) we can always deal with eigenfunctions
of a specific set of eigenvalues of $\mathcal{L}_z^{2}$, P, and K. The functions
$\varphi_1$ and $\varphi_2$ are both eigenfunctions of $\mathcal{L}_z^{2}$ with the eigenvalue 0. Hence
any linear combination will be the same. On the other hand, for these
very simple forms the operators P and K are equivalent and yield

$$P\varphi_1 = K\varphi_1 = \varphi_2, \qquad P\varphi_2 = K\varphi_2 = \varphi_1.$$

It follows that

$$K(\varphi_1 + \varphi_2) = \varphi_1 + \varphi_2,$$
$$K(\varphi_1 - \varphi_2) = -(\varphi_1 - \varphi_2).$$

Thus the sum and difference of $\varphi_1$ and $\varphi_2$ are simultaneous eigenfunctions
of $\mathcal{L}_z^{2}$, K, P as desired. Incidentally they are orthogonal. Normalization
yields ψ′ and ψ″ as the only suitably symmetrized approximate
solutions of (52·1) which can be built up out of the atomic wave functions
f and g. The application of the variational method to these two functions
then reduces to the evaluation of the corresponding energies:

$$E' = \int \psi' H \psi'\,d\tau = \frac{1}{2(1 + S)}\int (\varphi_1 + \varphi_2)\,H\,(\varphi_1 + \varphi_2)\,d\tau,$$

$$E'' = \int \psi'' H \psi''\,d\tau = \frac{1}{2(1 - S)}\int (\varphi_1 - \varphi_2)\,H\,(\varphi_1 - \varphi_2)\,d\tau.$$
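That these integrals reproduce the roots (52·2) and (52·3) of the secular equation follows on expanding the products. The lines below are a short check that uses only the matrix elements $H(m,k) = \int \varphi_m H \varphi_k\,d\tau$ already introduced, together with H(1,1) = H(2,2) and H(1,2) = H(2,1).

```latex
\[
E' = \frac{H(1,1) + H(1,2) + H(2,1) + H(2,2)}{2(1+S)}
   = \frac{H(1,1) + H(1,2)}{1+S},
\]
\[
E'' = \frac{H(1,1) - H(1,2) - H(2,1) + H(2,2)}{2(1-S)}
    = \frac{H(1,1) - H(1,2)}{1-S}.
\]
```

The symmetry argument therefore leads to the same two energies without any need to solve a secular equation.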

52c. The Method of James and Coolidge. — A decided improvement 
on the calculation of Heitler and London was effected by Wang,¹ who
introduced a single nonlinear parameter λ in the wave function and
was able to reduce the previous discrepancy of 1.58 electron-volts between
the experimental and theoretical values of the dissociation energy to
0.96 electron-volt.

Of the numerous attempts to make still more precise computations
the most successful is that of James and Coolidge,² who have used a
method of attack resembling that previously used by Hylleraas³ on the
helium atom. These authors, abandoning all attempt to employ atomic

¹ S. Wang, Phys. Rev. 31, 579 (1928).

² H. M. James and A. S. Coolidge, J. Chem. Phys. 1, 825 (1933); 3, 129 (1935).

³ E. A. Hylleraas, Zeits. f. Physik 54, 347 (1929).




wave functions as a basis for the treatment of the molecular problem, 
have introduced a system of non-orthogonal base functions involving 
products of powers of the elliptic coordinates of the two electrons and 
of the distance between the electrons. Their approximating wave 
functions were then built up as linear combinations of the base functions, 
and Eqs. (51·18) and (51·19) were employed to determine the best
values of the coefficients together with the corresponding energy. Using
a 13-term wave function and adding a small extrapolated correction for
additional terms not actually worked out, they obtained a computed
dissociation energy of 4.454 ± 0.013 electron-volts as compared with the
recent experimental value 4.455 ± 0.008 electron-volts. 



CHAPTER XII 


QUANTUM STATISTICAL MECHANICS AND THE EINSTEIN 
TRANSITION PROBABILITIES 


53. QUANTUM STATISTICAL MECHANICS


53a. The General Theory of Perturbations Which May Involve the
Time. — Up to this point we have been concerned almost exclusively with 
problems in which the Hamiltonian operator H is assumed to be independent
of time and in which the energy is conserved. In such cases
we obtain a unique physically admissible solution of the second Schrödinger
equation

$$H\Psi = -\frac{h}{2\pi i}\,\frac{\partial\Psi}{\partial t} \qquad (53\cdot1)$$

from every physically admissible initial function $\Psi_0(x) \equiv \Psi(x,0)$ by analyzing
$\Psi_0$ into eigenfunctions of H and setting $\Psi(x,t)$ equal to the expansion
[cf. Eq. (35·1)]

$$\Psi = \sum_n \xi_n\,\psi_n(x)\,e^{-\frac{2\pi i}{h}E_n t} + \sum_j \int \xi_j(E)\,\psi_j(E,x)\,e^{-\frac{2\pi i}{h}Et}\,dE. \qquad (53\cdot2)$$

Our perturbation theory has been concerned with the effect of a modifica- 
tion in H on its eigenvalues and eigenfunctions. 

In the present chapter we shall be concerned with problems in which a 
different type of perturbation theory is useful — a type which can be 
applied even when H depends explicitly on the time. Problems of this 
type have been treated by Schrödinger,¹ Dirac,² Born,³ and Slater.⁴
In such cases the system is nonconservative. The expansion (53·2)
with constant ξ's no longer gives a solution of (53·1) if the $\psi_n$'s and
$E_n$'s are identified with the instantaneous eigenfunctions and eigenvalues
of H. It then becomes convenient to study the variation of Ψ
in time with the aid of an expansion in terms of eigenfunctions of a
constant approximate Hamiltonian $H_0$. Such an expansion is possible at
every instant of time if the eigenfunctions of $H_0$ form a complete set. If
we identify the functions $\psi_n$ and the energies $E_n$ of (53·2) with the
eigenfunctions and eigenvalues, respectively, of $H_0$, the function Ψ
defined by the equation will become a solution of (53·1) provided that
the coefficients $\xi_n$ and $\xi_j(E)$, hitherto treated as constants, are required
to vary in a suitable manner with the time. The purpose of our present
theory is to study the dependence of the ξ's on the time by successive
approximations. The procedure was suggested independently by
Dirac and others and is often called the Dirac method of the variation
of constants. It is applicable even when the exact Hamiltonian H is
time-free.

¹ E. Schrödinger, Ann. d. Physik (4) 81, 109 (1926); 83, 956 (1927).

² P. A. M. Dirac, Proc. Roy. Soc. A112, 661 (1926).

³ M. Born, Zeits. f. Physik 40, 167 (1926). Cf. also Born and Jordan, Elementare
Quantenmechanik, Sec. 45, Berlin, 1930.

⁴ J. C. Slater, Proc. Nat. Acad. Sci. 13, 7, 104 (1927).

From (35·2) or (36·77) it follows that the expansion coefficients
$\xi_n$, $\xi_j(E)$ define the expectation values of the corresponding eigenvalues
of the observable $H_0$ for an assemblage of systems in a pure state with
the wave function Ψ. These expectation values vary with time as if
transitions from one eigenvalue of $H_0$ to another were taking place.
Application of this method to the problem of the interaction between
matter and radiation leads to a quantum-mechanical theory of the
Einstein transition probabilities for radiative processes.

In working out the details of the theory we shall use a notation
appropriate to the case in which $H_0$ has a purely discrete spectrum. As
previously noted we can always replace the continuous spectrum by a
dense discrete spectrum if we make a suitable modification of the boundary
conditions or of the natural operator $H_0$; or the theory here developed
can be extended by the introduction of the generalized matrices of
Sec. 44d.

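As a first step toward the solution of the perturbation problem we
reduce Eq. (53·1) to a convenient matrix form. Replacing H by
$H_0 + \lambda H_1$ and substituting $\sum_n \xi_n\psi_n e^{-\frac{2\pi i}{h}E_n t}$ for Ψ, we obtain

$$\lambda\sum_n \xi_n\,e^{-\frac{2\pi i}{h}E_n t}\,H_1\psi_n = -\frac{h}{2\pi i}\sum_n \frac{d\xi_n}{dt}\,\psi_n\,e^{-\frac{2\pi i}{h}E_n t},$$

the terms in $E_n\xi_n$ canceling because $H_0\psi_n = E_n\psi_n$.
The formation of the scalar product of each side of this equation with
$\psi_k(x)$ yields

$$\frac{d\xi_k}{dt} = -\frac{2\pi i}{h}\,\lambda\sum_n H_1(k,n)\,e^{\frac{2\pi i}{h}(E_k - E_n)t}\,\xi_n, \qquad k = 1, 2, 3, \ldots \qquad (53\cdot3)$$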

Let ξ(t) denote the vector matrix whose components are $\xi_1(t), \xi_2(t), \ldots$
and let W(t) denote the matrix whose typical element is

$$W(t)[k,n] = H_1(k,n)\,e^{\frac{2\pi i}{h}(E_k - E_n)t}.$$

Then the system of equations (53·3) can be written in the form¹

$$\frac{d\xi}{dt} = -\frac{2\pi i}{h}\,\lambda\,W(t)\,\xi(t). \qquad (53\cdot4)$$

Integration with respect to t yields

$$\xi(t) - \xi(0) = -\frac{2\pi i}{h}\,\lambda\int_0^{t} W(t')\,\xi(t')\,dt', \qquad (53\cdot5)$$

where ξ(0) denotes the constant vector matrix made up of the constant
values of the $\xi_n$'s for negative values of t. Let us now expand ξ(t) in a
power series in λ:

$$\xi(t) = \xi(0) + \sum_{p=1}^{\infty} \lambda^{p}\,\xi^{(p)}(t), \qquad (53\cdot6)$$

where $\xi^{(p)}(t)$ is a vector matrix to be determined. Substituting this
expansion into Eq. (53·5) and equating the coefficients of like powers of λ,
we obtain

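$$\xi^{(p)}(t) = -\frac{2\pi i}{h}\int_0^{t} W(t')\,\xi^{(p-1)}(t')\,dt', \qquad \xi^{(0)} \equiv \xi(0), \qquad (53\cdot7)$$

or

$$\xi_k^{(p)}(t) = -\frac{2\pi i}{h}\sum_n \int_0^{t} H_1(k,n)\,e^{\frac{2\pi i}{h}(E_k - E_n)t'}\,\xi_n^{(p-1)}(t')\,dt'. \qquad (53\cdot8)$$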


These equations complete the determination of ξ(t) and hence of
Ψ(x,t), thus solving our perturbation problem. It is advantageous,
however, to restate the theory in a slightly altered form. To this end
we recall that the unitary operator

$$T(t) = e^{-\frac{2\pi i}{h}tH}$$

transforms Ψ(x,0) into Ψ(x,t). It has a unitary matrix T(t) which
transforms the vector ξ(0) into the vector X(t) of footnote 1.


¹ Let X(t) denote the vector obtained from ξ(t) by multiplying each component
$\xi_n(t)$ by the corresponding phase factor $e^{-\frac{2\pi i}{h}E_n t}$. Then it is a simple matter
to derive the equation

$$\mathbf{H}X = -\frac{h}{2\pi i}\,\frac{dX}{dt}$$

which forms an exact matrix parallel to Eq. (53·1).





T{t) can be factored into the product of the unitary commuting 

2Trit jj 

operators e ^ ‘’and c ^ \ Hence T(0 can be factored into the 

product of the unitary matrices To(0 and Ti(0 defined by 

2irit 2iriEnt 

To(t)[myn] = ^ Vndr = e ^ 

2wiM , 

T^{t)[m,n] = f^Pn.*e * Vndr. 

We readily verify that 

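$$X(t) = T_0(t)\,\xi(t) = T_0(t)\,T_1(t)\,\xi(0).$$

Since $T_0(t)$ is unitary, we can multiply by its reciprocal as an antecedent
factor to obtain

$$\xi(t) = T_1(t)\,\xi(0). \qquad (53\cdot9)$$

Thus the determination of ξ(t) is reduced to that of evaluating $T_1(t)$.

Instead of attempting to evaluate $T_1(t)$ directly from the corresponding
operator we introduce this expression for ξ(t) into Eq. (53·5) and
obtain

$$T_1(t)\,\xi(0) = \xi(0) - \frac{2\pi i}{h}\,\lambda\int_0^{t} W(t')\,T_1(t')\,\xi(0)\,dt'. \qquad (53\cdot10)$$

Since this matrix equation should hold for every normalized vector ξ(0),
we conclude that

$$T_1(t) = I - \frac{2\pi i}{h}\,\lambda\int_0^{t} W(t')\,T_1(t')\,dt', \qquad (53\cdot11)$$

where I is the unit matrix. To solve for $T_1(t)$ we again develop in power
series in λ, remembering that $T_1$ must reduce to I when λ = 0. Then

$$T_1(t) = I + \sum_{p=1}^{\infty} \lambda^{p}\,T_1(t)^{(p)} \qquad (53\cdot12)$$

if we define $T_1(t)^{(p)}$ by the equation

$$T_1(t)^{(p)} = -\frac{2\pi i}{h}\int_0^{t} W(t')\,T_1(t')^{(p-1)}\,dt', \qquad (53\cdot13)$$

where $T_1(t)^{(0)} = I$.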

The advantage of this form of the theory is that it gives a convenient
expression for the so-called "transition probability" from one eigenstate
of the observable $H_0$ to another. Let us consider a case in which the





perturbing Hamiltonian $H_1$ vanishes outside the time interval 0 < t < θ.
The transition probability $\Phi_{n\to m}$ from the state $\psi_n$ to the state $\psi_m$ is then
defined by Born as the probability that the system will be found in the
state $\psi_m$ after the application of the perturbation when the initial state
is definitely known to be $\psi_n$. In other words $\Phi_{n\to m}$ is the value of
$|\xi_m(\theta)|^2$ when $|\xi_n(0)|$ is unity and the other components of ξ(0) are zero (cf.
Sec. 36b). Let us simplify the notation by introducing the symbol F
for $T_1(\theta)$. Then

$$\Phi_{n\to m} = |F(m,n)|^{2} = |T_1(\theta)[m,n]|^{2}$$
$$= \delta_{mn}\{1 + \lambda[F^{(1)}(m,n) + F^{(1)}(m,n)^{*}] + \lambda^{2}[F^{(2)}(m,n) + F^{(2)}(m,n)^{*}] + \cdots\} + \lambda^{2}|F^{(1)}(m,n)|^{2} + \cdots. \qquad (53\cdot14)$$

This expression can be reduced still further by means of the relation

$$\sum_k F(m,k)\,F(n,k)^{*} - \delta_{mn} = 0, \qquad (53\cdot15)$$

which expresses the unitary character of F. Expanding in powers of λ
and equating the coefficient of each power to zero, we obtain

$$F^{(1)}(m,n) + F^{(1)}(n,m)^{*} = 0,$$
$$F^{(2)}(m,n) + F^{(2)}(n,m)^{*} + \sum_k F^{(1)}(m,k)\,F^{(1)}(n,k)^{*} = 0. \qquad (53\cdot16)$$

In Eq. (53·14) it is evidently legitimate to replace n by m in the coefficient
of $\delta_{mn}$. Then, using (53·16), we find

$$\Phi_{n\to m} = \delta_{mn}\Big\{1 - \lambda^{2}\sum_k |F^{(1)}(m,k)|^{2} + \cdots\Big\} + \lambda^{2}|F^{(1)}(m,n)|^{2} + \cdots. \qquad (53\cdot17)$$

Thus the transition probabilities are determined to terms of the second
degree in λ from the first-degree term of the matrix $F = T_1(\theta)$.
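The first-degree term can be evaluated numerically for a simple case and compared with an exact result. The sketch below rests on assumptions made here for illustration: a system with only two eigenstates of $H_0$, units in which h/2π = 1, a perturbation with constant real matrix element $H_1(1,2) = H_1(2,1) = v$ and vanishing diagonal elements acting for 0 < t < θ, and hypothetical numerical values. The exact probability quoted for comparison is the standard closed-form solution of this two-state problem and is not taken from the text.

```python
import cmath
import math

def first_order_amplitude(v, omega, theta, steps=2000):
    """First-degree term F1(m,n) = -i * integral_0^theta v*exp(i*omega*t) dt (h/2pi = 1).

    omega = E_m - E_n; the integral is done by the trapezoidal rule.
    """
    dt = theta / steps
    total = 0.5 * (cmath.exp(0j) + cmath.exp(1j * omega * theta))
    for j in range(1, steps):
        total += cmath.exp(1j * omega * j * dt)
    return -1j * v * total * dt

def exact_probability(lam, v, omega, theta):
    """Exact two-state transition probability for a constant coupling lam*v."""
    rate = math.sqrt(omega ** 2 + 4.0 * (lam * v) ** 2)
    return (2.0 * lam * v / rate) ** 2 * math.sin(0.5 * rate * theta) ** 2

# Hypothetical parameters: weak coupling lam*v compared with the level spacing omega.
lam, v, omega, theta = 0.05, 1.0, 2.0, 1.0

f21 = first_order_amplitude(v, omega, theta)     # state 1 -> state 2
f12 = first_order_amplitude(v, -omega, theta)    # state 2 -> state 1

second_degree = lam ** 2 * abs(f21) ** 2         # lambda^2 |F1(2,1)|^2
closed_form = (2.0 * lam * v * math.sin(0.5 * omega * theta) / omega) ** 2

print(second_degree, closed_form, exact_probability(lam, v, omega, theta))
print(abs(f21 + f12.conjugate()))                # first relation of the unitarity expansion
```

For weak coupling the second-degree expression and the exact probability agree closely, and the last printed quantity, $F^{(1)}(2,1) + F^{(1)}(1,2)^{*}$, vanishes to rounding error, as the unitary character of F requires.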

The transition probability $\Phi_{m\to n}$ is to be distinguished from the Einstein
transition probabilities which govern the statistical averages of
the quantum jumps of a chaotic assemblage of atomic or molecular
systems in a natural radiation field. The Einstein transition probabilities
will be discussed in Sec. 54. Here we note that it is not possible
without proof to compute the probability of a final state $\psi_n$ for an
arbitrary initial state simply by summing the products of the various
$|\xi_m(0)|^{2}$ values by the corresponding Born transition probabilities.

53b. The Adiabatic Theorem.¹ Another important type of perturbation
problem involving the time is that in which the Hamiltonian $H(t)$
does not return to its original form after the time $\theta$ and in which the total

¹ Cf., e.g., M. Born and V. Fock, Zeits. f. Physik 51, 165 (1928).



432 


STATISTICAL MECHANICS AND RADIATION [Chap. XII 


change in the time $\theta$ is appreciable. In this case the analysis of the
perturbed wave functions into eigenfunctions of the unperturbed Hamiltonian
is pointless and we have to consider instead the analysis of $\psi$
into eigenfunctions of the instantaneous Hamiltonian as given by the
equation

$$H(t)\varphi_n(x,t) = E_n(t)\varphi_n(x,t). \qquad (53\cdot18)$$

In order to obtain a set of orthogonal eigenfunctions of this equation
which are uniquely defined except for constant phase-factors it is necessary
to introduce the auxiliary condition $(\partial\varphi_n/\partial t, \varphi_n) = 0$ which may be
compared with (47·10).

A case of special importance is that in which the operator $H(t)$ varies
adiabatically, i.e., very slowly with the time. The slow applications of
external electric and magnetic fields to atomic and molecular systems are
examples in point. Here the quantum-mechanical extension of Ehrenfest's
adiabatic theorem applies. This states that if the Hamiltonian
changes from the initial form $H_0$ to the final form $H_1$ in the time $\theta$, and
if the value of $\theta$ is allowed to approach infinity while the rate at which the
Hamiltonian changes approaches zero, the transition probability from
the state $m$ to the state $n$ approaches zero. In other words, solutions
of (53·18) subject to the above-mentioned auxiliary condition become
solutions of

$$H(t)\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}$$

as the rate of change of $H(t)$ approaches
zero. In case we have to do with the application of an external field
there is a gradual transformation of the eigenvalues and eigenfunctions
of $H$, but no forced jumps from one state as defined by (53·18) to another.
The theorem is proved, with suitable restrictions on $H(t)$, by Born
and Fock.¹

53c. The Fundamental Problems of Quantum Statistical Mechanics.

As a first application of the general theory of Sec. 53a we shall consider
the fundamental features of the quantum-mechanical form of statistical
mechanics. This branch of physical science has to do with the statistical
properties of assemblages of identical systems not so prepared as to have
a definite wave function but having a definite temperature. If we
approach the problem from the standpoint of Pauli² our treatment takes
a form which parallels the usual developments of classical statistical
mechanics and at the same time affords an important illustration of our
perturbation theory.

¹ Loc. cit.

² W. Pauli, Jr., in the Sommerfeld Festschrift, Probleme der Modernen Physik,
p. 31, Leipzig, 1928; Handbuch d. Physik, Geiger u. Scheel, 2d ed., Band XXIV/1,
Kap. 2, Ziff. 10, 1933. Cf. also P. Jordan, Statistische Mechanik auf
Quantentheoretischer Grundlage, Braunschweig, 1933.





Experiment shows that the statistical properties¹ of a large assemblage
of independent identical microscopic, or macroscopic, systems (i.e., a
Gibbsian assemblage) which have been “aged” in a thermostat at a
definite temperature $T$ for a sufficient length of time usually become
constant and independent of the initial state of the assemblage. The
ultimate state is then defined to be one of thermodynamic equilibrium
at the temperature $T$. By erasing all vestiges of the initial state the
thermostat acts as a history-destroying device. To be sure there are
numerous cases in which this function is imperfectly performed. In
such cases the state of true thermodynamic equilibrium, or maximum
entropy, is not reached in any measurable time at moderate temperatures.
We may restrict the discussion for the present, however, to systems for
which thermodynamic equilibrium is actually attainable.

If the systems under consideration are of large scale, the dispersion
of the results for certain types of measurement, notably that of energy,
becomes negligible compared with experimental error when the aging
process has been carried to completion. Such quasi-unique properties
of the individual system in an assemblage in thermal equilibrium are
often called normal properties of the assemblage. In any case, however,
there exist other properties which exhibit measurable fluctuations, but
with distribution functions which have definite forms for states of thermal
equilibrium.

From the standpoint of quantum mechanics the properties of the most
general Gibbsian assemblage are postulated to be those of an appropriate
mixture of pure states. The experimental facts then inform us that an
arbitrary mixture when subjected to interaction with a thermostat
for a sufficient length of time approaches a standard state with properties
uniquely defined by the temperature, the volume of the container, or
other externally variable parameters. An assemblage in the ultimate
state of thermal equilibrium will be referred to hereafter as a thermostat
assemblage. The fundamental problems of quantum statistical mechanics
are (a) the explanation of the approach to thermal equilibrium and (b)
the development of a detailed description of the corresponding thermostat
assemblage. Solution of these problems paves the way for a quantum-mechanical
derivation of the laws of thermodynamics and for a quantum-mechanical
theory of the properties of matter in bulk. In fact a solution
of the second problem is a prerequisite to any satisfactory application
of quantum-mechanical ideas to matter in bulk. In order to test the
application of the laws of quantum mechanics to any particular type of
system we must have a method for preparing an assemblage of systems
of the given type whose statistical state is known in sufficient detail to
permit predictions regarding measurements we are able to make. In

¹ By “statistical properties” we mean the totality of the distribution functions
representing the results of all types of statistical measurement.





the case of microscopic systems we can prepare assemblages which bear
some resemblance to pure states in order to test the theory. On account
of the appalling complexity of macroscopic systems the preparation of
approximate pure states is for them out of the question. We are accordingly
driven to the use of thermostat assemblages as the “known”
starting point for tests of the theory.

It is unfortunately impossible to give definite form to the equations of 
motion of a system, or an assemblage, subject to unknown perturbations 
from a thermostatic container. Hence it is customary to approach prob- 
lem (a) by way of a study of the idealized case of an assemblage of isolated 
systems. Fortunately experiment indicates that the tendency toward 
thermodynamic equilibrium exists for approximately isolated systems 
as well as for systems in intimate contact with thermostats. In fact 
it is a familiar extrapolation of experimental results to say that an 
assemblage of isolated large-scale systems started in any state with 
energies restricted to the macroscopically small range between E and 
E + AE will in time reach an apparently constant state experimentally 
indistinguishable from one of thermal equilibrium with a thermostat of 
temperature T appropriate to E. Our first theoretical step will then be 
to verify from the principles of quantum mechanics that the state of such 
an assemblage after a sufficient length of time will become experimentally 
indistinguishable from a constant state the same for all initial conditions 
consistent with the given energy E. This constant asymptotic state is 
used as a first approximation to the desired thermostat assemblage. To
get a better approximation we can imagine each of the systems A under
consideration united with a corresponding thermostat B into a single
isolated system C. Using an approximate thermostat assemblage for C
we can work out the statistical state of the A systems in contact with
thermostats. If the heat capacity of the A systems is small compared
with that of the thermostats, the resulting statistical state is independent
of the exact energy distribution function adopted for the assemblage of
supersystems C. Thus the statistical state of the A systems obtained
by this second approximation in the asymptotic case where the ratio of
the heat capacities is $1/\infty$ can be adopted as a satisfactory model for the
thermostat assemblage. It is in fact the canonical assemblage of Gibbs. 

53d. The Conventional Characterization of a Chaotic Assemblage.
It will be convenient to restrict the discussion for the present to systems
which can be resolved into a sum of parts (e.g., molecules) whose energies
are large on the average compared with their mutual energies. Let $H$
denote the Hamiltonian of the system as a whole. It will be resolvable
into the sum of terms $H_0$ and $H_1$, the former representing the sum of the
energies of the individual molecules, and the latter representing the
mutual energies which come into play classically in collisions. In
treating $H_1$ as small compared with $H$ we restrict the discussion to





approximations of the ideal-gas state. In order to take into account
the finite volume of the container an artificial term must be introduced
into the potential-energy function of each molecule which abruptly
becomes infinite when that molecule strikes the surface of the vessel.
This artificial term takes the place of the much more complicated wall
reaction for the case where the gas and the containing vessel are thought
of as parts of a single quantum-mechanical system. Owing to the
finite volume of the container the spectra of $H_0$ and $H_1$ will both be
purely discrete, though very dense. If we postulate “rough” walls,
the rotation-reflection group will be eliminated from the group of the
Schrödinger equation even when the containing vessel is macroscopically
spherical. The permutation group yields no degeneracy on account of
the Pauli principle. Hence the eigenvalues of $H$ are physically nondegenerate,
and we conclude that if the energy of one of the isolated
systems were exactly known, it would be in a pure state. On the other
hand the eigenvalues of $H_0$ may be highly degenerate.

In order to study the statistical properties of mixed assemblages it is
convenient to make use of the statistical matrix first introduced by von
Neumann¹ to characterize such an assemblage. Let $\psi_1, \psi_2, \cdots, \psi_n, \cdots$
denote a set of functions which is complete and orthonormal in the
configuration space of the system. Let $w_s$ and $\Psi_s$ denote the weight and
wave function of the $s$th pure-state subassemblage of the mixture under
consideration. Let $c_{ns}$ denote the Fourier coefficient defined by

$$\Psi_s = \sum_n c_{ns}\psi_n. \qquad (53\cdot19)$$

Let $\alpha$ be any observable. Its expectation value (cf. Sec. 41b) is

$$\bar\alpha = \sum_s w_s \int \Psi_s^*\,\alpha\Psi_s\,d\tau. \qquad (53\cdot20)$$

Substituting the expansion (53·19), we obtain

$$\bar\alpha = \sum_n\sum_m \alpha(m,n)\sum_s w_s c_{ns}c_{ms}^*,$$

where $\alpha(m,n)$ is the usual matrix element $\int\psi_m^*\,\alpha\psi_n\,d\tau$. Let the Hermitian
matrix $\rho$ be defined by

$$\rho(n,m) = \sum_s w_s c_{ns}c_{ms}^*. \qquad (53\cdot21)$$

Then

$$\bar\alpha = \sum_m\sum_n \alpha(m,n)\rho(n,m) = \mathrm{Spur}\,\alpha\rho, \qquad (53\cdot22)$$

¹ J. von Neumann, Nachr., p. 245 (1927); M.G.Q., p. 167. Cf. also P. A. M.
Dirac, Proc. Camb. Phil. Soc. 25, 62 (1929).





an invariant of a canonical transformation. The probabilities of different
eigenvalues of $\alpha$ can be calculated with equal simplicity from $\rho$ (they are
expectation values of suitably defined linear operators) so that $\rho$ characterizes
the entire statistical behavior of the assemblage and is properly
called its statistical matrix.
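
The construction of the statistical matrix in (53·21) and the trace formula (53·22) can be checked on a small example. The following is a minimal sketch only: the two-state basis, the weights, the coefficients and the Hermitian matrix `alpha` are arbitrary illustrative assumptions, not values taken from the text.

```python
# Sketch: rho(n,m) = sum_s w_s * c_ns * conj(c_ms) for a two-member mixture,
# compared with the directly averaged expectation value. All numbers are
# illustrative assumptions.

def statistical_matrix(weights, coeffs):
    """Build rho from the weights w_s and the coefficient lists c_s."""
    dim = len(coeffs[0])
    rho = [[0j] * dim for _ in range(dim)]
    for w, c in zip(weights, coeffs):
        for n in range(dim):
            for m in range(dim):
                rho[n][m] += w * c[n] * c[m].conjugate()
    return rho

def spur_of_product(a, b):
    """Trace of the matrix product a*b, i.e. sum over m,n of a(m,n) b(n,m)."""
    dim = len(a)
    return sum(a[m][n] * b[n][m] for m in range(dim) for n in range(dim))

def direct_expectation(weights, coeffs, a):
    """Weighted average of the pure-state expectation values."""
    total = 0j
    for w, c in zip(weights, coeffs):
        dim = len(c)
        total += w * sum(c[m].conjugate() * a[m][n] * c[n]
                         for m in range(dim) for n in range(dim))
    return total

s = 2 ** -0.5
weights = [0.25, 0.75]                    # w_s, summing to unity
coeffs = [[1.0, 0.0], [s, s * 1j]]        # normalized coefficient vectors c_s
alpha = [[1.0, 2 - 1j], [2 + 1j, -1.0]]   # a Hermitian observable

rho = statistical_matrix(weights, coeffs)
from_trace = spur_of_product(alpha, rho)
from_average = direct_expectation(weights, coeffs, alpha)
trace_of_rho = rho[0][0] + rho[1][1]
```

The two routes to the expectation value agree, and the diagonal elements of `rho` sum to unity because the weights and the coefficient vectors are normalized.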

In the classical statistical mechanics of Gibbs an assemblage of
independent systems is said to be in statistical equilibrium if the expectation
value of every function of the coordinates and momenta is constant
in time. We accordingly define statistical equilibrium for a mixed
assemblage in quantum mechanics by the requirement that $\rho$ shall be
constant in time. If we use exact eigenfunctions of the complete Hamiltonian
$H$ for the $\psi_n$'s, the expansion coefficients $c_{ns}$ take the form

$$c_{ns} = \gamma_{ns}\exp i\Bigl(-\frac{2\pi}{h}E_n t + \delta_{ns}\Bigr),$$

where $\gamma_{ns}$ and $\delta_{ns}$ are real constants. Introducing the abbreviation
$\nu_{mn}$ for $(E_m - E_n)/h$, we have

$$\rho(n,m) = \sum_s w_s\gamma_{ns}\gamma_{ms}\exp i(2\pi\nu_{mn}t + \delta_{ns} - \delta_{ms}).$$

Since the energy levels $E_n$ are nondegenerate, all the off-diagonal elements
of $\rho$ are periodic functions of the time and it is necessary for statistical
equilibrium that $\rho$ be a diagonal matrix. Reduction of $\rho$ to this form
implies either that every component pure-state assemblage shall have a
unique energy, or that the values of the phase differences $\delta_{ms} - \delta_{ns}$
shall be uniformly distributed over the range from $-\pi$ to $+\pi$ so that
$\sum_s w_s\gamma_{ns}\gamma_{ms}\exp i(\delta_{ns} - \delta_{ms})$ vanishes unless $m = n$. Thermostat assemblages
are special cases of statistical equilibrium which we interpret most
plausibly by the hypothesis of uniform phase distribution.
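
The cancellation produced by a uniform distribution of phase differences can be illustrated numerically. This is a sketch under stated assumptions: 360 equally weighted subassemblages, equal amplitudes, and phase differences spaced evenly over the range from −π to +π; none of these numbers comes from the text.

```python
import cmath
import math

def off_diagonal_element(weights, gamma_n, gamma_m, phase_differences):
    """Sum over s of w_s * gamma_ns * gamma_ms * exp(i * (delta_ns - delta_ms))."""
    return sum(w * gn * gm * cmath.exp(1j * d)
               for w, gn, gm, d in zip(weights, gamma_n, gamma_m, phase_differences))

S = 360                                   # assumed number of subassemblages
weights = [1.0 / S] * S                   # equal weights summing to unity
gamma_n = [math.sqrt(0.5)] * S            # assumed equal amplitudes
gamma_m = [math.sqrt(0.5)] * S

# Phase differences spread evenly over [-pi, pi): the element cancels.
uniform_phases = [-math.pi + 2.0 * math.pi * s / S for s in range(S)]
rho_uniform = off_diagonal_element(weights, gamma_n, gamma_m, uniform_phases)

# Identical phase differences: no cancellation occurs.
aligned_phases = [0.0] * S
rho_aligned = off_diagonal_element(weights, gamma_n, gamma_m, aligned_phases)
```

With evenly spread phases the off-diagonal element vanishes to rounding error, whereas with aligned phases it keeps the full value of the product of the amplitudes.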

Since the average values of the off-diagonal elements of $\rho$ are zero, it
is evident that the time-average of the expectation value of $\alpha$ for a sufficiently
long time is obtained from (53·22) by neglecting the off-diagonal
elements of $\rho$ and $\alpha$. The initial conditions may be such that these off-diagonal
elements make an important contribution to $\bar\alpha$ at the time
$t = 0$. However, we can be sure that there will be no systematic commensurabilities
among the periods of the off-diagonal elements. Hence
the distribution of their phases will become chaotic in time. Furthermore,
it is safe to assume that in any initial statistical state which it is
possible to prepare there will be a large number of off-diagonal elements
whose absolute values are of the same order of magnitude as their mean.¹

¹ The Gibbs classical theory indicates that the order of magnitude of the uncertainty
in the energy of a system in thermal equilibrium is equal to the average kinetic
energy per degree of freedom multiplied by the square root of the total number of
degrees of freedom. We cannot hope to determine the energy of a macroscopic system





In general the chaotic distribution of phases among these elements implies
a cancellation of contributions to $\bar\alpha$. From time to time accidental
correlations of the phases of $\rho$ and $\alpha$ will lead to momentary deviations
between the exact value of $\bar\alpha$ and that computed from the diagonalized
average of $\rho$. Similarly the distribution function for the eigenvalues of
$\alpha$ will fluctuate about an average form given by the diagonal matrix $\rho$.
But experimentally haphazard deviations in the distribution function for
the eigenvalues of $\alpha$ are not to be distinguished from the deviations in
the measured values of $\alpha$ allowed by the average distribution function.
Hence we can say that the statistical state of an arbitrary assemblage of
isolated systems started off in any practical way becomes, after a time
sufficient to insure a chaotic distribution of phases, experimentally indistinguishable
from the state of statistical equilibrium whose matrix is $\rho$.

Although use of a matrix scheme for $\rho$ based on an orthonormal system
of eigenfunctions of the complete Hamiltonian $H$ leads at once to a
simple criterion for statistical equilibrium analogous to that given by
Liouville's theorem in classical theory, this matrix scheme is not the one
which makes the closest connection with the language in which statistical
mechanics is ordinarily discussed. The eigenfunctions of $H$ are so
complicated that we can have little idea of their character in detail.
Thus the evaluation of the matrix of any observable $\alpha$ which could be
measured would be impossible if a scheme of this type were employed.
It is therefore convenient to replace the functions $\psi_n$ of Eq. (53·19)
by eigenfunctions $u_n(x)$ of the approximate Hamiltonian $H_0$. In place of
(53·19) we have

$$\Psi_s = \sum_n \xi_{ns}u_n(x). \qquad (53\cdot23)$$

Let $\rho'$ denote the statistical matrix in the new scheme with the typical
element $\rho'(n,m) = \sum_s w_s\xi_{ns}\xi_{ms}^*$. If $U$ is the unitary matrix which
transforms $c$ into $\xi$ according to the rule $\xi = Uc$, we have

$$\rho'(n,m) = \sum_l\sum_p U(n,l)\rho(l,p)U^{-1}(p,m). \qquad (53\cdot24)$$
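
The transformation rule (53·24) can be verified on a two-state example. The sketch below assumes a real rotation for the unitary matrix U and arbitrary weights and coefficients; these are illustrative choices only.

```python
import math

def statistical_matrix(weights, coeffs):
    """rho(n,m) = sum_s w_s * c_ns * conj(c_ms)."""
    dim = len(coeffs[0])
    rho = [[0j] * dim for _ in range(dim)]
    for w, c in zip(weights, coeffs):
        for n in range(dim):
            for m in range(dim):
                rho[n][m] += w * c[n] * c[m].conjugate()
    return rho

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    dim = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def dagger(a):
    """Conjugate transpose; for a unitary matrix this is the inverse."""
    dim = len(a)
    return [[a[m][n].conjugate() for m in range(dim)] for n in range(dim)]

def transform(u, c):
    """Apply xi = U c to one coefficient vector."""
    return [sum(u[n][l] * c[l] for l in range(len(c))) for n in range(len(u))]

theta = 0.3                                   # assumed rotation angle
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
weights = [0.4, 0.6]
c_coeffs = [[1.0, 0.0], [0.6, 0.8j]]          # normalized coefficient vectors
xi_coeffs = [transform(U, c) for c in c_coeffs]

rho = statistical_matrix(weights, c_coeffs)
rho_prime_direct = statistical_matrix(weights, xi_coeffs)
rho_prime_transformed = matmul(matmul(U, rho), dagger(U))

max_difference = max(abs(rho_prime_direct[n][m] - rho_prime_transformed[n][m])
                     for n in range(2) for m in range(2))
trace_change = abs((rho[0][0] + rho[1][1])
                   - (rho_prime_direct[0][0] + rho_prime_direct[1][1]))
```

Building the matrix from the transformed coefficients and transforming the matrix itself give the same result, and the sum of the diagonal elements is unchanged.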

more accurately. Hence the uncertainty in the total energy of any actual macroscopic
system, although small compared with the expectation value, is sure to be
astronomically large in comparison with the spacing of the energy levels of the
operators $H$ and $H_0$. Thus the total number of off-diagonal elements of $\rho$ which can
have appreciable values is enormous. We have no experimental means of choosing
initial statistical states which are approximate eigenfunctions of $H$ and on the basis
of any plausible assumption regarding a priori probabilities it is evident that initial
states involving anything but a haphazard distribution of absolute values over the
array of possible elements of $\rho$ are extremely improbable. Hence we postulate such a
haphazard distribution.




The elements of the matrix $\rho'$ can be divided into two classes consisting
respectively of those elements connecting different eigenvalues
of $H_0$ and those connecting different states with the same eigenvalue
of $H_0$. The elements of the former class yield sensibly zero time-averages
since these elements are in first approximation harmonic functions of $t$
with frequencies given by the differences between the two correlated
eigenvalues of $H_0$. Elements of the latter class, including the diagonal
elements of $\rho'$, are in first approximation constant, but undergo secular
changes in time and do not give zero time-averages. Nevertheless it is
customary in statistical mechanics to neglect all off-diagonal elements
of $\rho'$. The discarding of off-diagonal elements which do not average to
zero in time is justified by the observation that in the absence of a correlation
between the average phases of these elements and the phase of
the matrix of any measurable physical quantity $\alpha$ the contribution of
these elements to the expectation value of $\alpha$ or any function of $\alpha$ will be
small. Assemblages for which it is ordinarily legitimate to neglect the
off-diagonal elements of $\rho'$ are conveniently labeled chaotic.

The above approximation is of importance because of the connection it
makes with the conception of the stationary state characteristic of the
Bohr theory. In that theory it was assumed that every molecule of a
gas must spend practically all its time in one or another of the discrete
quantized single-energy states which in our present theory we should
correlate with the members of a system of orthonormal eigenfunctions of
the Hamiltonian of the individual molecule. Transfers from one state
to another were assumed to take place discontinuously as a result of
collisions and interaction with the ever-present radiation field. The
primary problem of statistical theory was then to determine the average
number of molecules in each individual molecule energy level for an
average sample of gas at a temperature $T$.

The picture behind such a computation is inadmissible from our
present point of view. The allowed “microscopic” states of the complete
sample of gas in the Bohr theory are defined by particular distributions
of the molecules among the allowed states for an individual molecule.
The analogues of these states of the macroscopic system in quantum
mechanics are pure states of a special type whose wave functions are the
products of eigenfunctions of the single-molecule Hamiltonian, or pure
states formed from such products by a symmetrization process in accordance
with the Pauli principle. These states can be correlated with the
functions $u_1, u_2, \cdots, u_n, \cdots$ of (53·23). Hence the analogue of the
Bohr concept of a Gibbsian assemblage of independent identical macroscopic
samples of gas is a mixture of pure states whose wave functions
are all members of the sequence $u_1, u_2, \cdots, u_n, \cdots$. The statistical
matrix $\rho'$ for such a mixture would be diagonal. It would not preserve





this form for any finite period of time, however, since the assumed wave
functions are not solutions of the appropriate Schrödinger equation,

$$H\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t}.$$

Nevertheless, it is convenient to correlate the state of a macroscopic
system as defined by actual experimental conditions with a corresponding
mixture of the above type suggested by the Bohr picture. This simply
means that we neglect the off-diagonal elements of the $\rho'$ already shown
to be small in the great majority of cases. No approximation is involved
in discarding these off-diagonal elements when we are computing the
expectation value of a dynamical variable $\alpha$ which commutes with $H_0$
and whose matrix in the $u_n$ scheme is itself diagonal. There is an
approximation, however, in the case of a variable, like the external
pressure, which does not commute with $H_0$. Consider, for example,
an assemblage of identical gaseous systems in which the density distribution
is very non-uniform. In order to calculate the density distribution
from the diagonal elements of $\rho'$ we have to calculate the
density distribution corresponding to each of the orthonormal functions
$u_n$, multiply by the corresponding diagonal element of $\rho'$, and add. Each
of the terms will contribute a roughly uniform density and so the density
computed in this way must be uniform. It follows at once that at the
moment in question the diagonal elements of $\rho'$ will not tell the truth,
i.e., the nondiagonal elements must play an important part. Such
states are, however, essentially ephemeral.

Since the off-diagonal elements of $\rho'$ are to be neglected, it will be
convenient in the remainder of this discussion to adopt the language of
the older quantum statistics, interpreting each of the diagonal elements
of $\rho'$ as the probability that an arbitrary member of the assemblage is
in the state with the wave function $u_n$. Let $N$ denote the total number
of systems in the whole assemblage and let $\rho_n'$ denote the diagonal
element $\rho'(n,n)$. Then $N_n = N\rho_n'$ can be interpreted as the population
of the state $u_n$.

53e. Transition Probabilities and Statistical Equilibrium for Chaotic
Assemblages. Before carrying the argument farther it will be well to
pause in order to formulate the fundamental assumption of most work
in the application of statistical mechanics to special problems.¹ It is
necessary to mention the existence of certain approximate integrals
of the Hamiltonian $H$ which are usually exact integrals of $H_0$ and which
may be used to divide the wave functions $u_1, u_2, \cdots$ into approximately
independent classes which are not mixed to any appreciable degree by
the action of the perturbation Hamiltonian $H_1$ in the time required for
¹ Cf., e.g., R. H. Fowler, Statistical Mechanics, 2d ed., §1·4, Cambridge and New
York, 1936.





the attainment of approximate statistical equilibrium within each class.¹
In such cases the sum of the diagonal elements of $\rho'$ associated with each
independent class, and therefore the total population of the class, is
sensibly constant for the duration of many experiments in which quasi-equilibrium
is reached, and the relative values of these sums are fixed
by the initial conditions and do not correspond to rigorous statistical or
thermodynamic equilibrium. The fundamental assumption mentioned
above is now that a satisfactory model thermostat assemblage for a
macroscopic system in approximate thermodynamic equilibrium is one
with a constant diagonal matrix whose nonvanishing diagonal elements
are associated with a single energy level $E_n$ of the approximate Hamiltonian
$H_0$, all populations associated with any one of the independent
classes being equal.

This assumption gives correct answers to the problems to which it is 
customarily applied, but would not be satisfactory for all imaginable 
purposes since the energy of a macroscopic system is never uniquely 
defined in the case of a statistical state which is experimentally realizable.
We cannot prove that the matrix $\rho'$ of an arbitrary assemblage of isolated
identical macroscopic systems with an initial energy uncertainty $\Delta E$ takes
the above form for large values of $t$, but we can prove a proposition which
for practical purposes is equivalent to that. To this end we restrict the
discussion to the case in which there is a single independent class of wave
functions and divide these functions into groups. Each group consists
of wave functions $u_n$ associated with energies ranging from $E$ to $E + \Delta E$,
but with a single set of eigenvalues of a suitably chosen set of mutually
compatible integrals of the unperturbed motion. Each of these groups
has a total population easily derivable from $\rho'$ by summing corresponding
diagonal elements. It is easier to study the secular changes in the total
populations of these groups than it is to study the changes in the population
of the individual substates. For this purpose we shall set up differential
equations for the time rate of change of the group populations
from which it follows that for large values of t the average population 
of the substates in one group must become equal to the average popula- 
tion of the substates in any other. But the different substates in any one 
group can be reckoned physically indistinguishable so that equality in 
the group-average populations of the substates is for physical purposes 

¹ For example, in the case of an experimental measurement of the specific heat of
H₂ at low temperatures the transformation of orthohydrogen molecules into parahydrogen
molecules and vice versa takes place so slowly that we may assume statistical
equilibrium with respect to all other types of transition and at the same time treat the 
relative amounts of orthohydrogen and parahydrogen as constants of the motion 
fixed once and for all by the initial conditions. Under these circumstances the 
different values of the ratio of orthohydrogen to parahydrogen divide the wave 
functions into independent classes which we regard as immiscible. 




equivalent to equality in the individual populations of all substates.
The method is that of Dirac¹ and Pauli (loc. cit.).
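
The equalization of the group-average populations can be pictured with a toy rate equation. The sketch below is only an illustration under loudly stated assumptions: it assumes a rate law in which each group's total population changes in proportion to the difference between the average substate populations of the groups, with a single symmetric rate constant. The group sizes, initial populations, rate constant and step size are invented for the example and are not taken from the text.

```python
# Toy model (an assumption, not a formula quoted from the text):
#   dN_i/dt = sum over j of rate * (N_j / g_j - N_i / g_i),
# where N_i is the total population of group i and g_i its number of substates.

def relax_group_populations(populations, sizes, rate, dt, steps):
    """Integrate the assumed rate law with simple Euler steps."""
    current = list(populations)
    count = len(current)
    for _ in range(steps):
        per_state = [n_g / g for n_g, g in zip(current, sizes)]
        change = [sum(rate * (per_state[j] - per_state[i])
                      for j in range(count) if j != i)
                  for i in range(count)]
        current = [n_i + dt * c for n_i, c in zip(current, change)]
    return current

sizes = [2, 6]                 # assumed numbers of substates in two groups
initial = [800.0, 200.0]       # assumed initial group populations
final = relax_group_populations(initial, sizes, rate=1.0, dt=0.01, steps=5000)
final_per_state = [n_g / g for n_g, g in zip(final, sizes)]
```

In this toy model the total population is conserved and the average population per substate becomes the same in both groups, which is the behavior the argument of this section sets out to establish.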

The instantaneous time derivative of any diagonal element of $\rho'$ is
given by

$$\frac{d\rho_n'}{dt} = \sum_s w_s\frac{d}{dt}|\xi_{ns}|^2$$

and Eq. (53·3). This instantaneous derivative, however, involves high-frequency
terms which cause rapid oscillations in $d\rho_n'/dt$ about a mean
value which gives the secular change in $\rho_n'$. We shall be interested in
the mean, or secular, derivatives of the diagonal elements and these
are most conveniently computed with the aid of the perturbation theory
of Sec. 53a.

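In place of (53·17) we have

$$|\xi_{ms}(t)|^2 = \sum_n |F(m,n)|^2|\xi_{ns}(0)|^2 + \sum_n\sum_{n'>n}\bigl\{F(m,n)F(m,n')^*\xi_{ns}(0)\xi_{n's}(0)^* + F(m,n)^*F(m,n')\xi_{ns}(0)^*\xi_{n's}(0)\bigr\} \qquad (53\cdot25)$$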

for the case of an arbitrary initial wave function $\Psi_s(0)$. In order to
make progress by our chosen method of attack it is necessary to eliminate
the double sum and with it the high-frequency contributions to $|\xi_{ms}(t)|^2$.
The elements of this double sum are real quantities which can equally
well be positive or negative. They will yield a negligibly small contribution
if there are many equally important nonvanishing terms in
the sum with a haphazard distribution of phases. It is not difficult to
see that this condition should be satisfied by the great majority of the
possible choices of the coefficients $\xi_{ns}(0)$ describing pure initial states
consistent with the condition that a measurement of the total energy
must be certain to yield a value in the basic interval $\Delta E$. It follows that
if we multiply the above equation by $w_s$ and sum over all subassemblages
of a mixture consistent with the same condition, the contribution of the
double sum to the result will nearly always be negligible. Hence in the
case of any mixed assemblage which has a claim to consideration as a
possible model thermostat assemblage,

$$N_m(t) = N\rho_m'(t) = N\sum_s w_s|\xi_{ms}(t)|^2 = \sum_n |F(m,n)|^2 N_n(0). \qquad (53\cdot26)$$

From (53·13) we have

$$F^{(1)} \equiv T_1(t)^{(1)} = \cdots$$

¹ P. A. M. Dirac, Proc. Roy. Soc. A114, 243 (1927).





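Thus

$$F^{(1)}(m,n) = H_1(m,n)\,\frac{e^{2\pi i\nu_{mn}t} - 1}{E_n - E_m}$$

if $E_m \neq E_n$, and

$$F^{(1)}(m,n) = -\frac{2\pi i}{h}H_1(m,n)\,t$$

if $E_m = E_n$. Neglecting higher order terms in (53·17) and setting $\lambda$ equal
to unity, we can throw (53·26) into the form

$$N_m(t) = N_m(0) + {\sum_n}'|F^{(1)}(m,n)|^2\{N_n(0) - N_m(0)\}
= N_m(0) + {\sum_n}'|H_1(m,n)|^2\,\frac{4\sin^2(\pi\nu_{nm}t)}{(E_n - E_m)^2}\{N_n(0) - N_m(0)\}. \qquad (53\cdot27)$$

Here $\nu_{nm}$ is again used to denote $(E_n - E_m)/h$ and ${\sum_n}'$ denotes a sum over
all values of $n$ not equal to $m$.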
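
The first-order kernel $4|H_1(m,n)|^2\sin^2(\pi\nu_{nm}t)/(E_n - E_m)^2$ can be compared with an exactly soluble case. The sketch below assumes a system with only two states coupled by a constant real matrix element `v`, for which the exact transition probability is the familiar two-level (Rabi) formula. The numerical values and the choice of units with $h = 1$ are illustrative assumptions.

```python
import math

def first_order_probability(v, delta_e, t, h=1.0):
    """First-order transition probability 4 v^2 sin^2(pi*delta_e*t/h) / delta_e^2."""
    if delta_e == 0.0:
        # Degenerate limit of the expression above: (2*pi*v*t/h)^2.
        return (2.0 * math.pi * v * t / h) ** 2
    return 4.0 * v ** 2 * math.sin(math.pi * delta_e * t / h) ** 2 / delta_e ** 2

def exact_two_level_probability(v, delta_e, t, h=1.0):
    """Exact result for two states with level spacing delta_e and coupling v."""
    omega = math.sqrt(delta_e ** 2 + 4.0 * v ** 2)   # exact level splitting
    if omega == 0.0:
        return 0.0
    return 4.0 * v ** 2 / omega ** 2 * math.sin(math.pi * omega * t / h) ** 2

# Weak coupling, nondegenerate levels (illustrative numbers).
p_first = first_order_probability(0.01, 1.0, 0.3)
p_exact = exact_two_level_probability(0.01, 1.0, 0.3)

# Degenerate levels at a short time.
p_first_degenerate = first_order_probability(0.01, 0.0, 0.5)
p_exact_degenerate = exact_two_level_probability(0.01, 0.0, 0.5)
```

For weak coupling the two expressions differ only in higher orders of `v`, in line with the statement that terms beyond the second degree in $\lambda$ are neglected.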

In order to derive a differential equation from (53·27) it is necessary
to approximate the sum which it contains by means of an integral. To
this end we must examine the matrix elements $H_1(m,n)$ in some detail.
The perturbing Hamiltonian operator $H_1$ gives the mutual potential
energy of the molecules. It will accordingly be made up of a sum of
terms each of which involves the coordinates of two molecules only.
The wave functions $u_1, u_2, \cdots$ will be linear combinations of products
of individual molecule functions chosen from an orthonormal set $\varphi_1,
\varphi_2, \cdots$. These linear combinations will be so selected as to accord
with the symmetry requirements of the Pauli principle. Every complete
function $u_n$ is defined by a set of non-negative integers $a_1^{(n)}, a_2^{(n)}, \cdots$
which determine the number of molecules assigned to each of the $\varphi$'s.
The $a_r^{(n)}$'s are, of course, subject to the restriction that $\sum_r a_r^{(n)}$ equal
the total number of molecules. The
matrix elements of $H_1$ between two states with wave functions $u_n$ and $u_m$
can be shown to vanish unless the two corresponding sets of integers $a_r^{(n)}$
and $a_r^{(m)}$ are identical except in the case of four values of $r$, say $\alpha, \beta, \gamma, \delta$.
The detailed proof will be supplied by the reader without difficulty after
he has examined Secs. 63 and 64. Let $V(1,2)$ denote the part of $H_1$ which
gives the mutual energy of molecules 1 and 2. Each nonvanishing
matrix element $H_1(m,n)$ turns out to be a multiple of an integral of the
form $\int V(1,2)\varphi_\alpha^*(1)\varphi_\beta^*(2)\varphi_\gamma(1)\varphi_\delta(2)\,d\tau_1 d\tau_2$ where $d\tau_1$ is an element of
volume in the configuration space of molecule 1 and $d\tau_2$ is a similar element
for molecule 2 and

$$a_\alpha^{(m)} = a_\alpha^{(n)} + 1; \quad a_\beta^{(m)} = a_\beta^{(n)} + 1; \quad a_\gamma^{(m)} = a_\gamma^{(n)} - 1; \quad a_\delta^{(m)} = a_\delta^{(n)} - 1.$$





Let us now replace the single index $n$ of the wave function $u_n$ with a
pair of indices, say $n$ and $\nu$, with the following properties: $n$ is to fix
the translational energy state of one molecule, while $\nu$ determines the
complete state of all other molecules and simultaneously the internal
state of the one. A group of wave functions $u_{n\nu}$ with a single value of $\nu$
and a variety of choices of $n$ can be reckoned macroscopically equivalent.
With the double index notation the typical matrix element of $H_1$ becomes

$$H_1(m,\mu;\,n,\nu) \propto \int V(1,2)\varphi_\alpha^*(1)\varphi_\beta^*(2)\varphi_\gamma(1)\varphi_\delta(2)\,d\tau_1 d\tau_2.$$

If we hold $m, \mu, \nu$ fast and vary $n$, we change only the translational factor
of one of the two functions $\varphi_\gamma(1)$, $\varphi_\delta(2)$. In order that $H_1(m,\mu;\,n,\nu)$ shall
be large it is necessary that real or imaginary parts of the product
$\varphi_\alpha^*(1)\varphi_\beta^*(2)\varphi_\gamma(1)\varphi_\delta(2)$ shall be predominantly of one sign in that portion
of the configuration space of the molecules 1 and 2 where $V(1,2)$ is large.
In other words there must be a constructive interference in configuration
space which becomes less and less possible as the total energy of the state
$\varphi_\gamma(1)\varphi_\delta(2)$ diverges more and more from that of the state $\varphi_\alpha(1)\varphi_\beta(2)$.
Hence the average absolute value of the matrix elements $H_1(m,\mu;\,n,\nu)$,
other things being equal, must be a function of $E_{m\mu} - E_{n\nu}$ with a maximum
when $E_{m\mu} - E_{n\nu}$ is zero.

Let us now introduce the quantity $J$, which we define as the value of

$$\frac{1}{\Delta\epsilon}\sum_n |H_1(m,\mu;\,n,\nu)|^2$$

when the sum is extended over all values of $n$ corresponding to
energies $E_{n\nu}$ in the range $\epsilon - \tfrac{1}{2}\Delta\epsilon < E_{n\nu} < \epsilon + \tfrac{1}{2}\Delta\epsilon$.
We assume that we can choose $\Delta\epsilon$ large enough so that the sum includes
many terms and small enough so that $J$ changes by a small fraction of
itself when $\epsilon$ is increased by $\Delta\epsilon$. In accordance with the foregoing
discussion we postulate that $J$ is a symmetric function of $\epsilon - E_{m\mu}$, which
we write as $J_{\mu\nu}(\epsilon - E_{m\mu})$. The range, say $D\epsilon$, of values of $\epsilon$ which
contribute appreciably to

$$\int J_{\mu\nu}(\epsilon - E_{m\mu})\,d\epsilon = \sum_n |H_1(m,\mu;\,n,\nu)|^2$$

is difficult to estimate but can hardly be of order of magnitude greater
than that of the average translational energy of one molecule.

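A similar average value is defined by

$$K_{\mu\nu}(\epsilon - E_{n\nu}) = \frac{1}{\Delta\epsilon}\sum_{m\ \mathrm{in}\ \Delta\epsilon} |H_1(m,\mu;\,n,\nu)|^2,$$

where the sum is to be extended to all values of $m$ corresponding to
energies $E_{m\mu}$ in the interval $\epsilon - \tfrac{1}{2}\Delta\epsilon < E_{m\mu} < \epsilon + \tfrac{1}{2}\Delta\epsilon$.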





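We rewrite Eq. (53·27) in the double-subscript notation and sum over
all values of $m$ for which $E < E_{m\mu} < E + \Delta E$. Introducing the
abbreviation $N_\mu(t) = \sum_{m\ \mathrm{in}\ \Delta E} N_{m\mu}(t)$, we obtain

$$N_\mu(t) - N_\mu(0) = -4\sum_{m\ \mathrm{in}\ \Delta E} N_{m\mu}(0)\sum_\nu\Bigl\{\sum_n |H_1(m,\mu;n,\nu)|^2\,\frac{\sin^2[\pi(E_{m\mu}-E_{n\nu})t/h]}{(E_{m\mu}-E_{n\nu})^2}\Bigr\}$$
$$\qquad + 4\sum_\nu\sum_n N_{n\nu}(0)\Bigl\{\sum_{m\ \mathrm{in}\ \Delta E} |H_1(m,\mu;n,\nu)|^2\,\frac{\sin^2[\pi(E_{m\mu}-E_{n\nu})t/h]}{(E_{m\mu}-E_{n\nu})^2}\Bigr\}. \qquad (53·28)$$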

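The two quantities in braces can be replaced by integrals. Thus

$$\sum_n |H_1(m,\mu;n,\nu)|^2\,\frac{\sin^2[\pi(E_{m\mu}-E_{n\nu})t/h]}{(E_{m\mu}-E_{n\nu})^2} = \int J_{\mu\nu}(\epsilon - E_{m\mu})\,\frac{\sin^2[\pi(\epsilon - E_{m\mu})t/h]}{(\epsilon - E_{m\mu})^2}\,d\epsilon$$
$$\qquad = \frac{\pi t}{h}\int_{-\infty}^{+\infty} J_{\mu\nu}\!\left(\frac{h\zeta}{\pi t}\right)\frac{\sin^2\zeta}{\zeta^2}\,d\zeta.$$

The familiar function $\sin^2\zeta/\zeta^2$ consists of a series of arches between the
zeros at $\pm m\pi$. The area under the dominant central arch is over 90 per
cent of the area under the complete graph of the function. Hence
we can neglect the other arches in first approximation and can treat
$J_{\mu\nu}(h\zeta/\pi t)$ as a constant equal to $J_{\mu\nu}(0)$ provided that $t \gg h/D\epsilon$, so that
the quantity $|h\zeta/\pi t|$ is much less than $D\epsilon$ when $|\zeta| < \pi$. Under these
circumstances the quantity $J_{\mu\nu}(0)$ can be taken out from under the
integral sign and the integral evaluated. The sum reduces to $\pi^2 t J_{\mu\nu}(0)/h$.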
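The statement about the central arch can be checked numerically. The following is a minimal sketch, not part of the original argument: it integrates $\sin^2\zeta/\zeta^2$ over $-\pi < \zeta < \pi$ with a composite Simpson rule and compares the result with the known value $\pi$ of the integral over the whole axis. The helper names `sinc2` and `simpson` are illustrative choices.

```python
import math

def sinc2(x):
    """Return sin^2(x)/x^2, using the limiting value 1 at x = 0."""
    if abs(x) < 1e-12:
        return 1.0
    s = math.sin(x) / x
    return s * s

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

central_area = simpson(sinc2, -math.pi, math.pi)   # area under the central arch
total_area = math.pi                               # exact integral over the real axis
fraction = central_area / total_area
print(f"central arch holds {100 * fraction:.1f} per cent of the area")
```

The computed fraction lies a little above nine-tenths, in agreement with the "over 90 per cent" quoted in the text.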

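The second expression in braces in (53·28) requires slightly different
treatment since we have to sum only over values of $m$ in the interval $\Delta E$.
We find

$$\sum_{m\ \mathrm{in}\ \Delta E} |H_1(m,\mu;n,\nu)|^2\,\frac{\sin^2[\pi(E_{m\mu}-E_{n\nu})t/h]}{(E_{m\mu}-E_{n\nu})^2} = \frac{\pi t}{h}\,K_{\mu\nu}(0)\int_\gamma \frac{\sin^2\zeta}{\zeta^2}\,d\zeta,$$

where $\gamma$ is the range of values of the quantity

$$\zeta = \frac{\pi t}{h}(\epsilon - E_{n\nu})$$

which places $\epsilon$ in the range $\Delta E$. In other words $\gamma$ denotes the interval

$$\frac{\pi t}{h}(E - E_{n\nu}) < \zeta < \frac{\pi t}{h}(E + \Delta E - E_{n\nu}).$$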





The ratio of the width of the central arch of $\sin^2\zeta/\zeta^2$ to the width of $\gamma$ is
$2h/t\,\Delta E$. We assume that for the values of $t$ and $\Delta E$ under consideration,
this quantity is much less than unity. Then the function $\int_\gamma \frac{\sin^2\zeta}{\zeta^2}\,d\zeta$ will
approximate $\pi$ for nearly all values of $E_{n\nu}$ well inside $\Delta E$ and will
approximate 0 for nearly all values of $E_{n\nu}$ outside $\Delta E$. The sum we are trying to
evaluate is to be interpreted as a multiple of the transition probability
from the state $u_{n\nu}$ to the group of states $u_{m\mu}$ for which $E_{m\mu}$ is in $\Delta E$.
Since this transition probability is sensibly 0 unless $E_{n\nu}$ is also in $\Delta E$, we
conclude that transitions only take place between states of essentially
the same energy.
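The selection of nearly equal energies can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the dimensionless width $t\,\Delta E/h = 200$ is an arbitrary choice satisfying $2h/t\,\Delta E \ll 1$, and the helper names are hypothetical. It evaluates the integral of $\sin^2\zeta/\zeta^2$ over the window $\gamma$ once for an energy $E_{n\nu}$ at the middle of $\Delta E$ and once for an energy lying a full $\Delta E$ beyond its upper edge.

```python
import math

def sinc2(x):
    """Return sin^2(x)/x^2, using the limiting value 1 at x = 0."""
    if abs(x) < 1e-12:
        return 1.0
    s = math.sin(x) / x
    return s * s

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def window_integral(offset, width):
    """Integrate sin^2(z)/z^2 over the window gamma.

    offset = (E_n_nu - E) / Delta E   (values between 0 and 1 lie inside Delta E)
    width  = t * Delta E / h          (dimensionless, assumed large)
    """
    scale = math.pi * width
    lower = scale * (0.0 - offset)
    upper = scale * (1.0 - offset)
    return simpson(sinc2, lower, upper)

width = 200.0                                   # assumed value of t * Delta E / h
inside = window_integral(0.5, width) / math.pi  # E_n_nu at the middle of Delta E
outside = window_integral(2.0, width) / math.pi # E_n_nu one full Delta E above the range
print(inside, outside)
```

With these assumed numbers the first ratio is close to unity and the second is close to zero, which is the behavior used in the text.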

Equation (53·28) now becomes

$$\frac{N_\mu(t) - N_\mu(0)}{t} = \frac{4\pi^2}{h}\sum_\nu \bigl[N_\nu(0)K_{\mu\nu}(0) - N_\mu(0)J_{\mu\nu}(0)\bigr]. \qquad (53·29)$$


We denote the quantities $\frac{4\pi^2}{h}K_{\mu\nu}(0)$ and $\frac{4\pi^2}{h}J_{\mu\nu}(0)$ by $T_{\nu\to\mu}$ and $T_{\mu\to\nu}$,
respectively. They are transition probabilities per unit time, like the
Einstein transition probabilities, but have their origin in collisions
between molecules rather than in the interaction of matter and radiation.
With this notation (53·29) yields the set of differential equations

$$\frac{dN_\mu}{dt} = \sum_\nu \bigl(N_\nu T_{\nu\to\mu} - N_\mu T_{\mu\to\nu}\bigr) \qquad (53·30)$$

for the group populations. The particular method which we have used
for specifying the groups is evidently not the only possible one.

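These transition probabilities are not independent. Let $G_\mu$ denote
the number of states $u_{m\mu}$ per unit increment in the energy $E_{m\mu}$. Let
$G_\nu$ denote the corresponding number of states $u_{n\nu}$. These play the parts
of statistical weights for the two sets of states. Then

$$G_\mu T_{\mu\to\nu} = \frac{4\pi^2}{h}\,\frac{1}{(\Delta\epsilon)^2}\sum_{m\ \mathrm{in}\ \Delta\epsilon}\ \sum_{n\ \mathrm{in}\ \Delta\epsilon} |H_1(m,\mu;n,\nu)|^2 = G_\nu T_{\nu\to\mu}. \qquad (53·31)$$

Let $w_\mu$, $w_\nu$ denote the relative populations per state of the groups $\mu$ and $\nu$,
defined as $N_\mu/G_\mu$ and $N_\nu/G_\nu$, respectively. Equations (53·30) and (53·31)
yield

$$\frac{dw_\mu}{dt} = \sum_\nu T_{\mu\to\nu}(w_\nu - w_\mu). \qquad (53·32)$$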




From this equation it is clear that groups with more than average
population per state must lose population to those with less than average
population. Thus we have a tendency toward an ultimate statistical
equilibrium in which the average population per state of each group is equal
to that of every other group. If we reckon the various subjective states
in any one group as experimentally indistinguishable, we may say that
the assemblage as a whole makes an asymptotic approach to a statistical
state which is experimentally indistinguishable from one in which all
subjective states of the same energy are equally populated. We are
accordingly at liberty to use as a first approximation model thermostat
assemblage one with a constant diagonal statistical matrix $\rho'$ whose
nonvanishing elements are restricted to a narrow energy range $\Delta E$
and have values dependent only on their energies. Allowing $\Delta E$ to
shrink to zero, thus permitting only one energy value, we obtain the
model used by Fowler.¹
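The approach to equal population per state can be illustrated with a small numerical experiment. The sketch below is an illustration only: the three statistical weights, the symmetric table `S` (standing for the products $G_\mu T_{\mu\to\nu} = G_\nu T_{\nu\to\mu}$), the step size, and the initial populations are all hypothetical values chosen for the example, and the rate equations for the group populations are integrated by simple Euler steps.

```python
def relax(T, N0, dt=1e-3, steps=20000):
    """Euler integration of dN_mu/dt = sum_nu (N_nu T[nu][mu] - N_mu T[mu][nu])."""
    N = list(N0)
    k = len(N)
    for _ in range(steps):
        dN = [sum(N[nu] * T[nu][mu] - N[mu] * T[mu][nu]
                  for nu in range(k) if nu != mu)
              for mu in range(k)]
        N = [N[mu] + dt * dN[mu] for mu in range(k)]
    return N

G = [1.0, 2.0, 5.0]            # hypothetical statistical weights of three groups
S = [[0.0, 0.6, 0.3],          # hypothetical symmetric table S[mu][nu] = G_mu * T_mu->nu
     [0.6, 0.0, 0.9],
     [0.3, 0.9, 0.0]]
T = [[S[m][n] / G[m] for n in range(3)] for m in range(3)]

N0 = [800.0, 0.0, 0.0]         # all systems start in the first group
N = relax(T, N0)
w = [N[m] / G[m] for m in range(3)]   # populations per state
print(N, w)
```

In this example the total population is conserved and the populations per state `w` tend to the common value `sum(N0) / sum(G)`, whatever the initial distribution, because the symmetry of `S` makes every term of the rate equation proportional to a difference of populations per state.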

53f. The Gibbs Canonical Assemblage for Systems of the Most
General Type. — The weakness of the foregoing derivation of (53·32)
is that it is based on the use of first-order perturbation approximations
for the calculation of increments in $w_\mu$ during time intervals which
must be large compared with $h/D\epsilon$. The procedure should be satisfactory
provided that the fractional change in $w_\mu$ for the time interval
$h/D\epsilon$ is itself small compared with unity. The most difficult conditions
to meet are those in which $dw_\mu/dt$ is a maximum, or in which all the
$w_\nu$'s in (53·32) are set equal to zero. The indicated fractional initial
decrease in $w_\mu$ in the time $h/D\epsilon$ is then approximately $\frac{h}{D\epsilon}\sum_\nu T_{\mu\to\nu}$.
$\sum_\nu T_{\mu\to\nu}$ is the average total transition probability from states of the $\mu$
group to all other groups per unit of time. To estimate its value we
note that classically every collision between two molecules would throw
the system out of its initial state into another state. Hence $\sum_\nu T_{\mu\to\nu}$
can be identified with the total number of collisions per unit time in the
sample of gas.

If $l$ is the mean free path, $\bar v$ is the average velocity and $N$ the total
number of molecules in the sample of gas which constitutes the system,
the condition for the validity of (53·32) with all $w_\nu$'s set equal to zero is

$$\frac{h}{D\epsilon}\,\frac{N\bar v}{2l} \ll 1.$$

Taking $D\epsilon$ equal to the average translational kinetic energy per molecule,
as previously suggested, we can reduce the above inequality to

$$\frac{N\lambda}{l} \ll 1,$$

where $\lambda$ is the average de Broglie wave length.

¹ Loc. cit.

In the asymptotic case in which the volume of the gas is made to
approach infinity, while $N$ is held constant, the required conditions can
be met and hence we can be assured that (53·32) holds for all $\mu$ in the
case of an absolutely perfect gas. We must apply it with circumspection
to actual samples of gas, however, for under standard conditions the
value of $N\lambda/l$ for one cubic centimeter of oxygen is of the order of $10^{16}$.
In the case of initial states close to statistical equilibrium conditions
are more favorable, but it is evident that (53·32) does not afford an
appropriate basis for the calculation of the approach to statistical
equilibrium under ordinary conditions.¹
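A rough numerical estimate of $N\lambda/l$ for one cubic centimeter of oxygen can be made as follows. This is a sketch under stated assumptions, not a value taken from the text: the mean free path is an assumed round figure for oxygen at 0 °C and atmospheric pressure, the mean speed is taken from the Maxwell distribution, and the physical constants are modern values.

```python
import math

h = 6.62607015e-34        # Planck constant, J s
k_B = 1.380649e-23        # Boltzmann constant, J/K
amu = 1.66053907e-27      # atomic mass unit, kg

T = 273.15                # temperature, K
N = 2.687e19              # molecules in 1 cm^3 at 0 C and 1 atm (Loschmidt number)
m = 32.0 * amu            # mass of an O2 molecule, kg
mean_free_path = 6.5e-8   # m; assumed round value, not taken from the text

v_mean = math.sqrt(8.0 * k_B * T / (math.pi * m))   # Maxwell mean speed, m/s
wavelength = h / (m * v_mean)                       # mean de Broglie wave length, m
ratio = N * wavelength / mean_free_path
order = math.floor(math.log10(ratio))
print(f"N*lambda/l is about {ratio:.1e}")
```

Under these assumptions the ratio exceeds unity by many orders of magnitude, so the inequality required for the validity of (53·32) fails badly for an ordinary sample of gas.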

In view of this conclusion special interest attaches to the problem of
generalizing our discussion of thermostat assemblages to include systems
which are not ideal gases. This can be done by means of a reinterpretation
of the analysis of Sec. 53c in which we identify the "molecule"
of that discussion with a complete macroscopic sample of matter, say $S$, of
any type, gaseous or otherwise, and the "system" with a collection $\Theta$
of identical samples $S_1, S_2, \cdots$, each in its own container, but with
weak interactions between each sample and its neighbors through the
separating walls. We identify $H_0$ with the sum of the exact Hamiltonians
of the separated samples and $H_1$ with the interaction energy. $H_1$ can
be taken as small as desired, since it is an artificial element in the picture
and not an intrinsic property of the sample.

Let us now apply the above analysis to a Gibbsian superassemblage $A$
of identical independent systems $\Theta_1, \Theta_2, \cdots$. By making $H_1$ sufficiently
small we can avoid the time-interval difficulty and obtain the
result that a satisfactory thermostat model for $A$ is one for which all
diagonal elements of $\rho'$ vanish except those associated with a single
energy level of $H_0$, say $E$, and whose nonvanishing diagonal elements
are all equal. This is the Fowler model.

Following the method of Darwin and Fowler it is then possible to
show that if $\epsilon_r$ denotes an energy level of the individual system $S$ the
statistical average value of the number of systems $S$ in the level $\epsilon_r$ for the
model thermostat assemblage $A$ is

$$\bar a_r = \frac{N e^{-\epsilon_r/kT}}{\sum_n e^{-\epsilon_n/kT}}. \qquad (53·33)$$


¹ This should not be surprising since the well-known extreme improbability of
appreciable departures from statistical equilibrium implies that when such departures
do occur they must be followed by a return to equilibrium which takes place in a very
small interval of time.





The proof is given by Fowler,¹ and need not be reproduced here. The
number $\bar a_r$ is, however, by definition a diagonal element of the
statistical matrix $\rho$ for an assemblage of systems $S$ taken from a
thermostat at temperature $T$. The assemblage defined by (53·33) is called a
Gibbs canonical assemblage. This assemblage is uniquely determined
and free from the arbitrariness characteristic of other model thermostat
assemblages. It has the property that two assemblages of this kind
are not disturbed if interactions are allowed between each member of
one assemblage and a corresponding member of the other.

Finally, the canonical assemblage is the only model thermostat
assemblage which can be used equally well for microscopic and macroscopic
systems. Thus the canonical assemblage may be regarded as the
fundamentally correct one even though for convenience in the discussion
of macroscopic systems the Fowler model is often preferable.

Equation (53·33) specifies the diagonal matrix $\rho$ for a thermostat
assemblage of systems $S$ with nondegenerate energy levels. If $S$ is
composed of many like molecules with small interaction energy, one
readily verifies that the same formula can be used for the diagonal
elements of the corresponding matrix if we identify $r$ with the general
ordinal number for a sequence of orthonormal eigenfunctions of the
approximate Hamiltonian and $\epsilon_r$ with the corresponding energy.

54. THE ABSORPTION AND EMISSION OF RADIATION: PERTURBATION
OF AN ATOMIC SYSTEM BY A CLASSICAL RADIATION FIELD


54a. The Einstein Derivation of the Planck Radiation Formula. — 

Before taking up the quantum-mechanical theory of the interaction of
matter and radiation we pause to give a brief account of Einstein's
early "derivation" of the Planck black-body radiation formula.² This
derivation was in fact a fitting together of the Bohr theory, statistical
mechanics, and the radiation formula with the aid of suitable auxiliary
hypotheses regarding the frequency of the radiative processes.

Einstein assumed the Bohr postulates a and b of Sec. 46a and took
over from the quantum statistical mechanics of the day (1917) the following
formula for the average number of atoms (or molecules) $N_n$ in the
$n$th energy level of the Hamiltonian for the internal coordinates in the
case of a thermostat assemblage of samples of gas at the temperature $T$:

$$N_n = \frac{N g_n e^{-E_n/kT}}{\sum_k g_k e^{-E_k/kT}}. \qquad (54·1)$$


¹ Loc. cit., §2·32; see footnote 1, p. 439.

² A. Einstein, loc. cit., footnote 2, p. 378.

This formula is readily derived from (53·33). $g_n$ denotes the statistical
weight of the energy level, usually identified with the number of different
sets of quantum numbers consistent (under a given scheme of quantization)
with the energy $E_n$. From our present point of view $g_n$ is to
be identified with the number of linearly independent physically admissible
eigenfunctions associated with $E_n$. If the energy levels are
nondegenerate, or if the index $n$ refers to a uniquely defined stationary
state with a definite complete set of quantum numbers, the statistical
weights are equal to unity.
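The role of the statistical weights in the level populations can be made concrete with a short computation. The sketch below assumes that the populations are proportional to $g_n e^{-E_n/kT}$ and are normalized to the total number $N$; the three energies, the weights, and the value of $kT$ are hypothetical numbers in arbitrary units.

```python
import math

def level_populations(N, energies, weights, kT):
    """Populations proportional to g_n * exp(-E_n / kT), normalized to N."""
    terms = [g * math.exp(-E / kT) for E, g in zip(energies, weights)]
    partition_sum = sum(terms)
    return [N * term / partition_sum for term in terms]

energies = [0.0, 1.0, 2.0]     # hypothetical level energies, in units of kT
weights = [1, 3, 5]            # hypothetical statistical weights g_n
pops = level_populations(1000.0, energies, weights, kT=1.0)
print(pops)
```

The ratio of any two populations is the ratio of the weights multiplied by the Boltzmann factor for the energy difference, which is the only property of the distribution used in the Einstein argument.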

The jumps from one energy level to another which Bohr assumed
to accompany the spontaneous emission of radiation are analogous to
the spontaneous disintegrations of radioactivity. Hence Einstein
supposed that the radiative transitions of free atoms are governed by a
law of probability similar to that postulated in the elementary theory
of radioactivity. Specifically, he assumed that the probability that
an atom in the energy level $E_{n'}$ will have a transition to any definite
lower energy level $E_{n''}$ in the time $dt$ is $A_{n'\to n''}\,dt$, where $A_{n'\to n''}$ is a
constant characteristic of the pair of levels called the (Einstein) transition
probability for the type of jump under consideration. If at any time
there are $N_{n'}$ atoms in the level $E_{n'}$, the number of such transitions per
second is $N_{n'}A_{n'\to n''}$. Neglecting the Doppler effect due to the
translational motions of the atoms, these transitions yield an emission of
$N_{n'}A_{n'\to n''}h\nu_{n'n''}$ ergs per second of monochromatic radiation of frequency

$$\nu_{n'n''} = (E_{n'} - E_{n''})/h.$$

In order to take into account the absorption as well as the emission of
radiation, the theory postulated a corresponding transition probability for
atoms jumping upward from the energy level $E_{n''}$ to the level $E_{n'}$ in a
radiation field. In accordance with experiment the probability of this
second type of transitions was assumed to be proportional to the energy
density of the radiation field per unit frequency interval at the frequency
$\nu_{n'n''}$. Let $\rho(\nu_{n'n''})$ denote this quantity. Einstein assumed that the
number of transitions of this second type per second is $N_{n''}B_{n''\to n'}\rho(\nu_{n'n''})$,
where $B_{n''\to n'}$ is a second transition probability.

Still a third type of transition, viz., emission from the energy level
$E_{n'}$ stimulated by the radiation field and proportional to $\rho(\nu_{n'n''})$, was
needed in order to account for the desired Planck formula. The existence
of such stimulated emission was foreshadowed by the classical theory
of the interaction between an oscillating system of charges and a classical
radiation field.¹ The probable number of such emission processes per
second was set equal to $N_{n'}B_{n'\to n''}\rho(\nu_{n'n''})$, where $B_{n'\to n''}$ is a third
transition probability.

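In thermal equilibrium, emission and absorption must balance so that

$$N_{n'}[A_{n'\to n''} + B_{n'\to n''}\rho(\nu_{n'n''})] = N_{n''}B_{n''\to n'}\rho(\nu_{n'n''}).$$

Inserting the values of $N_{n'}$, $N_{n''}$ and simplifying, we obtain

$$g_{n'}[A_{n'\to n''} + B_{n'\to n''}\rho(\nu_{n'n''})] = g_{n''}e^{h\nu_{n'n''}/kT}B_{n''\to n'}\rho(\nu_{n'n''}).$$

Hence

$$\rho(\nu_{n'n''}) = \frac{g_{n'}A_{n'\to n''}}{g_{n''}B_{n''\to n'}e^{h\nu_{n'n''}/kT} - g_{n'}B_{n'\to n''}}. \qquad (54·2)$$

In order to deduce the Planck formula

$$\rho(\nu) = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1} \qquad (54·3)$$

Einstein had only to assume the relationships

$$g_{n''}B_{n''\to n'} = g_{n'}B_{n'\to n''}, \qquad (54·4)$$

$$A_{n'\to n''} = \frac{8\pi h\nu_{n'n''}^3}{c^3}\,B_{n'\to n''}. \qquad (54·5)$$

¹ E.g., J. H. Van Vleck, Phys. Rev. 24, 330 (1924).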

With the aid of these formulas it was possible to derive the values of
$B_{n'\to n''}$ and $B_{n''\to n'}$ from the corresponding value of $A_{n'\to n''}$ when the last
mentioned transition probability had been worked out from the Bohr
correspondence principle. Although thermal equilibrium must be
assumed in order to obtain the radiation formula, the postulate that
atoms jump from the upper level to the lower at a net rate

$$N_{n'}[A_{n'\to n''} + B_{n'\to n''}\rho(\nu_{n'n''})] - N_{n''}B_{n''\to n'}\rho(\nu_{n'n''}) \qquad (54·6)$$

is assumed to hold outside thermal equilibrium for any "natural"
(chaotic) radiation field.
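The logic of the Einstein argument can be verified numerically. The sketch below is an illustration under stated assumptions: the frequency, temperature, statistical weights, and the value of the stimulated-emission coefficient `B_down` are hypothetical, and SI constants are used for convenience. It solves the balance condition for the equilibrium energy density and compares the result with the Planck formula once the two Einstein relationships between the coefficients are imposed.

```python
import math

h = 6.62607015e-34      # Planck constant, J s
c = 2.99792458e8        # speed of light, m/s
k_B = 1.380649e-23      # Boltzmann constant, J/K

def equilibrium_density(A, B_down, B_up, g_up, g_low, nu, T):
    """Solve g_up*(A + B_down*rho) = g_low*exp(h nu/kT)*B_up*rho for rho."""
    boltzmann = math.exp(h * nu / (k_B * T))
    return g_up * A / (g_low * B_up * boltzmann - g_up * B_down)

def planck(nu, T):
    """Planck energy density per unit frequency interval."""
    return (8.0 * math.pi * h * nu**3 / c**3) / math.expm1(h * nu / (k_B * T))

nu, T = 5.0e14, 3000.0          # hypothetical frequency (Hz) and temperature (K)
g_up, g_low = 3, 1              # hypothetical statistical weights
B_down = 1.0e20                 # hypothetical stimulated-emission coefficient

B_up = g_up * B_down / g_low                        # relation between the two B's
A = 8.0 * math.pi * h * nu**3 / c**3 * B_down       # relation between A and B
rho = equilibrium_density(A, B_down, B_up, g_up, g_low, nu, T)
print(rho, planck(nu, T))
```

The two printed values agree, and the agreement does not depend on the particular value chosen for `B_down` or on the statistical weights, since these cancel once the two relationships hold.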

54b. Elementary Approaches to the Quantum Theory of the Einstein
Transition Probabilities. — As noted in Sec. 53d, Bohr's basic assumption
that every atom or molecule spends practically all its time in one or
another of the stationary definite energy states with intermittent
discontinuous jumps from one to the other does not hold in quantum
mechanics. Only in a very special class of subjective states is an atom
assigned to one definite energy level. Nevertheless it is possible to
give a satisfactory reinterpretation of the concepts which enter his
formulas and to maintain that the formulas are still correct. A complete
quantum theory of the emission and absorption of radiation derives the
Einstein formulas and the values of the transition probabilities which
appear in them. Such a theory was first given by Dirac¹ following an
argument similar to that of Sec. 53e, but with terms in the Hamiltonian
for the energy of the radiation field and the mutual energy of matter and
field. To give an account of the Dirac theory would lead us beyond the
scope of this book. Hence we shall content ourselves with the lesser
problem of setting up formulas for the basic constants $A_{n'\to n''}$, $B_{n'\to n''}$,
$B_{n''\to n'}$.

¹ P. A. M. Dirac, loc. cit.; E. Fermi, Rev. Mod. Phys. 4, 87 (1932); G. Breit, Rev.
Mod. Phys. 4, 504 (1932).
This problem can be dealt with in a variety of ways involving different
degrees of sophistication. Schrödinger's original "hydrodynamical"
method¹ treated the atom as a continuous charge distribution to which
the classical radiation formulas could be directly applied. The charge
density $\rho(x,y,z,t)$ was computed from the wave function $\Psi(x_1, \cdots, z_f, t)$
as we should now compute the instantaneous statistical average charge
density. The procedure is out of harmony with the present interpretation
of the formalism of wave mechanics and gives results which
are only partly right, since the computed rate of spontaneous emission
depends on the "population" of the lower energy levels into which the
atoms drop as they emit. In fact the computed rate of spontaneous
emission falls to zero if all the atoms of the assemblage described by a
wave function are confined to a single energy level.

In the Heisenberg matrix theory the correct formulas for the Einstein
transition probabilities are obtained directly from the Bohr correspondence
principle, i.e., as ad hoc postulates based on classical analogy.
Thus the rate of spontaneous emission per atom due to transitions from
one nondegenerate state to another given by (46·8) is to be identified
with the quantity $h\nu_{n'n''}A_{n'\to n''}$. The transition probability is accordingly

$$A_{n'\to n''} = \frac{64\pi^4\nu_{n'n''}^3}{3hc^3}\,|\mathbf{D}(n',n'')|^2, \qquad (54·7)$$

where $\mathbf{D}$ is the Heisenberg matrix associated with the electric moment of
the atom, $\mathbf{D} = \sum_k e_k\mathbf{r}_k$. In order to work out the corresponding Einstein
transition probabilities for degenerate energy levels we proceed as
follows. We assume that we have to do with an assemblage in which
the population $N_{n'm'}$ of a substate of $E_{n'}$, defined by the wave function
$\psi_{n'm'}$, is independent of $m'$. (This is in accordance with the usual rules
of elementary quantum statistical mechanics for thermal equilibrium.)
Then the probability that an arbitrary system in the level $E_{n'}$ will
make a transition to $E_{n''}$ in unit time is the average of the corresponding
probabilities for the various initial substates. The probability for the
substate defined by $\psi_{n'm'}$ is $\frac{64\pi^4\nu_{n'n''}^3}{3hc^3}\sum_{m''}|\mathbf{D}(n'm';\,n''m'')|^2$. Taking
the average over the $g_{n'}$ substates of the upper energy level we obtain

$$A_{n'\to n''} = \frac{64\pi^4\nu_{n'n''}^3}{3hc^3 g_{n'}}\sum_{m'}\sum_{m''}|\mathbf{D}(n'm';\,n''m'')|^2. \qquad (54·8)$$


¹ E. Schrödinger, Ann. d. Physik 81, 134 (1926). Cf. also A. Sommerfeld,
A.S.W.E., p. 66, and Condon and Morse, Q.M., p. 90.





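From Eqs. (54·5) and (54·4) we learn that the corresponding values
of the other two Einstein transition probabilities are

$$B_{n'\to n''} = \frac{8\pi^3}{3h^2 g_{n'}}\sum_{m'}\sum_{m''}|\mathbf{D}(n'm';\,n''m'')|^2, \qquad (54·9)$$

$$B_{n''\to n'} = \frac{8\pi^3}{3h^2 g_{n''}}\sum_{m'}\sum_{m''}|\mathbf{D}(n'm';\,n''m'')|^2. \qquad (54·10)$$

These results are in harmony with experiment and with the conclusions
to be obtained from more complete theories.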

A third method of attack on the problem of emission and absorption,
independently devised by Dirac, Born, and Slater, is to deal with absorption
and stimulated emission by a direct application of perturbation theory
to the problem of the interaction of a quantum-mechanical system with a
classical radiation field.¹ Using this procedure we obtain the formulas
(54·9) and (54·10) as a consequence of principles already laid down without
recourse to analogy. Unfortunately it is impossible to give a direct
quantum-mechanical treatment of the problem of spontaneous emission
except by applying the same quantum principles to the radiation field
which we have developed for atomic systems.² Hence the procedure
under consideration has to lean on the Einstein theory for a derivation
of $A_{n'\to n''}$ from $B_{n'\to n''}$.

The remainder of Sec. 54 is devoted to an exposition of the method of
Dirac, Born, and Slater.

54c. The Perturbing Hamiltonian for a Classical Radiation Field. —
The Hamiltonian operator for an atomic system interacting with a
classical radiation field with the scalar potential $\Phi(x,y,z)$ and the vector
potential $\mathfrak{A}(x,y,z)$ was derived in Sec. 7, p. 28. The reduced form of this
Hamiltonian [cf. Eqs. (7·11) and (49·2)] is


where 



¹ Cf. P. A. M. Dirac, Proc. Roy. Soc. A112, 673 (1926); M. Born, Zeits. f. Physik
40, 167 (1926); J. C. Slater, Proc. Nat. Acad. Sci. 13, 7 (1927).

² Cf. footnote 1, p. 450.





It can be easily verified that this Hamiltonian is Hermitian with respect to class D
and the domain of all Cartesian-coordinate space. This was almost proved for the
three-dimensional case in Sec. 8, where we showed that

$$\frac{\partial}{\partial t}\iiint \Psi\Psi^*\,dx\,dy\,dz = 0,$$

provided that $\Psi$ is a suitably restricted solution of the Schrödinger equation

$$H\Psi = -(h/2\pi i)\,\partial\Psi/\partial t,$$

with $H$ defined by Eq. (54·11), $n$ being set equal to unity. The reader will readily
verify that the conditions for the theorem there proved are fulfilled if $\Psi$ is of class D.
To prove the Hermitian character of $H$, we replace $\Psi$ by $\Psi_1 + \Psi_2$ and $\Psi_1 + i\Psi_2$ in
turn, where $\Psi_1$ and $\Psi_2$ are any two class D solutions of the differential equation. It
follows that

$$\frac{\partial}{\partial t}\iiint (\Psi_1\Psi_2^* + \Psi_2\Psi_1^*)\,dx\,dy\,dz = 0$$

and

$$\frac{\partial}{\partial t}\iiint (\Psi_1\Psi_2^* - \Psi_2\Psi_1^*)\,dx\,dy\,dz = 0.$$

Hence

$$-\frac{h}{2\pi i}\frac{\partial}{\partial t}\iiint \Psi_1\Psi_2^*\,dx\,dy\,dz = \iiint (\Psi_2^* H\Psi_1 - \Psi_1 H^*\Psi_2^*)\,dx\,dy\,dz = 0.$$

Since we can choose any class D function as a possible instantaneous form for a solution
of the Schrödinger equation, it follows that

$$\iiint \Psi_2^* H\Psi_1\,dx\,dy\,dz = \iiint \Psi_1 H^*\Psi_2^*\,dx\,dy\,dz$$

holds for any pair of class D functions, or that $H$ is Hermitian with respect to class D.
The extension of the theorem to the case of a many-particle problem is left to the
reader.

Let $H_0$ denote the Hamiltonian for the atomic system in the absence
of a radiation field. As in Sec. 53 we assume an arbitrary subjective
state $\Psi$ and an expansion in terms of a normal orthogonal set of
discrete eigenfunctions $\psi_n(x)$ of $H_0$. Thus

$$\Psi = \sum_n \Gamma_n(t)\,\psi_n(x)\,e^{-2\pi i E_n t/h}. \qquad (54·13)$$

We shall compute the variation in the coefficients $\Gamma_n$ with time due to the
radiation field.

Our purpose is to determine the rate of change in the relative probability
of the different states as affected by interaction with a "natural"
radiation field of given frequency distribution (as given by experimental
spectrum analysis). In making such a computation we calculate the
perturbation in a time interval short enough to allow us to treat the
radiation energy density per unit frequency interval as constant and
the changes in the $\Gamma_n$'s as infinitesimals. It is important to note, however,
that the time interval under consideration, while small from a macroscopic
point of view, is best made large from a microscopic point of view.
Otherwise we should get into difficulty in relating the convenient
mathematical description of the field with the empirical spectrum analysis
(cf. pp. 457 and 458).

Let us deal first with the elementary case of a radiation field consisting
solely of plane waves moving in the direction of the $x$ axis and polarized
in the $x,z$ plane, i.e., with the electric vector parallel to the $y$ axis. In
the case of a plane wave system the scalar potential $\Phi$ can be set equal to
zero, in which case the vector potential is purely transverse and parallel
to the electric vector. Let the interval during which the perturbing field
acts at the origin be $0 < t < \theta$. As we are concerned only with the values
of the vector potential during this interval we can represent the potential
by a Fourier integral. Thus

$$\mathfrak{A}_y = \lambda f(t - x/c); \quad \mathfrak{A}_x = \mathfrak{A}_z = 0, \qquad (54·14)$$

$$\mathfrak{E}_y = -\frac{1}{c}\frac{\partial \mathfrak{A}_y}{\partial t} = \lambda\int_{-\infty}^{+\infty}\phi_y(\nu)\,e^{2\pi i\nu(t - x/c)}\,d\nu; \quad \mathfrak{E}_x = \mathfrak{E}_z = 0. \qquad (54·15)$$

Here $\lambda$ is a real parameter which determines the amplitude of the wave
system and which will be identified with the parameter $\lambda$ of the general
perturbation theory of Sec. 53. As $f(t - x/c)$ is real, the amplitude
function for the electric force is subject to the relation

$$\phi_y(\nu) = \phi_y^*(-\nu).$$

Dropping the scalar potential $\Phi$ from (54·11) and neglecting the terms
in $\mathfrak{A}^2$, we reduce the Hamiltonian to the standard form
$H_0 + \lambda H_1$, with



3 


Equation (54-14) gives the further reduction 



3 


(54-16) 


(54-17) 


54d. The Born Transition Probability. — The first step in carrying
through the perturbation scheme of Sec. 53 is to evaluate the elements
of the Heisenberg matrix $H_1^{(0)}$. A typical element has the form


HfCn.m) = • f (54-18) 

If the wave lengths which contribute appreciably to the field are all large
compared with the atomic dimensions, we can treat the vector $\mathfrak{A}$ in first
approximation as constant over the neighborhood of the atom and can
replace $f(t - x_j/c)$ by the value of $f(t - x/c)$ at the origin, which we take
to be at or near the center of the system. This approximation is equivalent
to neglecting the absorption due to the magnetic dipole moment
and the electric quadrupole moment in comparison with that due to the
electric dipole moment (cf. Sec. 54g). Then








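$$H_1^{(0)}(n,m) = -\frac{f(t)}{c}\sum_j \frac{e_j}{\mu_j}\,p_{y_j}^{(0)}(n,m). \qquad (54·19)$$

Here $p_{y_j}^{(0)}$ is the Heisenberg matrix of $p_{y_j}$ for the unperturbed system.
It follows from the first of the Hamiltonian equations (45·4) that

$$\sum_j \frac{e_j}{\mu_j}\,p_{y_j}^{(0)} = \frac{dD_y^{(0)}}{dt}. \qquad (54·20)$$

Hence

$$H_1^{(0)} = -\frac{f(t)}{c}\,\frac{dD_y^{(0)}}{dt}, \qquad (54·21)$$

where $D_y^{(0)}$ is the Heisenberg matrix of the $y$ component of the electric
moment. Carrying out the indicated differentiation with respect to the
time, we obtain

$$H_1^{(0)}(n,m) = -\frac{2\pi i}{hc}\,(E_n - E_m)\,f(t)\,D_y^{(0)}(n,m). \qquad (54·22)$$

It follows from this equation that the elements of $H_1^{(0)}$ between states
of the same energy are all zero. With the aid of Eqs. (53·11) and (53·17)
we infer that dipole radiation produces no transitions between different
states of the same energy.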

The first-order perturbations of the wave functions are now deter- 
mined by Eqs. (53*9), (53*12), and (53*13): 


Ti{dy^Hn,m\ = - E„)mD^l'Kn,m)dt'. (54-23) 

Let

$$\nu_{nm} = (E_n - E_m)/h; \qquad D_y(n,m) = e^{-2\pi i\nu_{nm}t}\,D_y^{(0)}(n,m) = \int \psi_n^* D_y \psi_m\,d\tau. \qquad (54·24)$$





The matrix $D_y$ is a Schrödinger matrix based on the arbitrary orthonormal
set of eigenfunctions of $H_0$ introduced in (54·13). It follows
from the Fourier integral theorem (Sec. 9a) and Eq. (54·14) that

^4^ = (54-25) 

Aici/ V ~ * 

As we are not interested in $f(t)$ outside the interval $0 < t < \theta$ we can
assume that it vanishes outside the interval in question and can modify
the limits of integration accordingly. Substituting $\nu_{nm}$ for $\nu$, we obtain

(64-26) 

ZirlVnm Jo 

A combination of this result with Eq. (54·23) yields

$$\Gamma_{nm}^{(1)} = \frac{2\pi i}{h}\,\phi_y(\nu_{nm})\,D_y(n,m), \qquad (54·27)$$

where $\Gamma_{nm}^{(1)}$ is simply a convenient abbreviation for $\Gamma_n(\theta)^{(1)}$.

The Born transition probability, giving the chance that a system
initially in the state $\psi_m$ will be found after the time $\theta$ in the state $\psi_n$,
is now given in first approximation by (53·17). If $n \neq m$, we find to
terms in $\lambda^2$

$$|\Gamma_{nm}|^2 = \frac{4\pi^2\lambda^2}{h^2}\,|\phi_y(\nu_{nm})|^2\,|D_y(n,m)|^2. \qquad (54·28)$$

The chance that a system initially in the state $\psi_n$ will be found in the
same state at the end of the perturbation is

$$|\Gamma_{nn}|^2 = 1 - \frac{4\pi^2\lambda^2}{h^2}\sum_{m \neq n}|\phi_y(\nu_{nm})|^2\,|D_y(n,m)|^2 \qquad (54·29)$$

to terms in $\lambda^2$.

So far we have had to do with the effect of a single plane-polarized
plane-wave train on the motion and energy of an isolated atomic system.
The important physical problem, however, is that of the perturbation
produced by a "natural," or chaotic, radiation field. Such a field can
be analyzed into a superposition of plane waves moving in all directions
with a random distribution of phases. It suffices for our present purpose,
however, to analyze the components of $\mathfrak{A}$ at the origin into Fourier
integrals. Thus we write



4>x(»' ) 

2int 




2Tnv ^ 


2viv 





(t>t{v)e^^^^dv. 


(54-30) 





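In evaluating $\phi_x$, $\phi_y$, $\phi_z$ we again assume that $\mathfrak{E}$ and $\mathfrak{A}$ vanish at the origin
outside the interval $0 < t < \theta$.

The contributions of the three orthogonal components of $\mathfrak{A}$ to the
matrix $\Gamma^{(1)}$ are readily seen to be independent. Hence

$$\Gamma_{nm}^{(1)} = \frac{2\pi i}{h}\,[\phi_x(\nu_{nm})D_x(n,m) + \phi_y(\nu_{nm})D_y(n,m) + \phi_z(\nu_{nm})D_z(n,m)]. \qquad (54·31)$$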

In this more general case "nondiagonal elements" of the Born transition
probability are obtained by multiplying the right-hand member of
(54·31) by its complex conjugate. Some of the terms are essentially
positive, but the phases of those involving products of field-strength
components in different directions, such as $\phi_x(\nu_{nm})\phi_y^*(\nu_{nm})$, can have any
value. Since we never know the phases of the components of a radiation
field, we are interested in the mean value of $|\Gamma_{nm}|^2$ for all possible phases.
In such an average the above-mentioned cross terms drop out. If $n$
is not equal to $m$, we have, in first approximation,

$$\overline{|\Gamma_{nm}|^2} = \frac{4\pi^2\lambda^2}{h^2}\,\{|\phi_x(\nu_{nm})|^2|D_x(n,m)|^2 + |\phi_y(\nu_{nm})|^2|D_y(n,m)|^2 + |\phi_z(\nu_{nm})|^2|D_z(n,m)|^2\}. \qquad (54·32)$$
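The disappearance of the cross terms under an average over phases can be confirmed with a small computation. The sketch below is an illustration only: the three complex amplitudes are hypothetical numbers standing for the three products of field amplitude and moment component, and the average over independent phases is taken on a uniform grid, for which the mean of each cross term vanishes exactly.

```python
import cmath
import itertools
import math

def mean_square_modulus(amplitudes, n_phase=8):
    """Average |sum_k a_k exp(i theta_k)|^2 over a uniform grid of independent phases."""
    phases = [2.0 * math.pi * j / n_phase for j in range(n_phase)]
    total = 0.0
    count = 0
    for thetas in itertools.product(phases, repeat=len(amplitudes)):
        s = sum(a * cmath.exp(1j * th) for a, th in zip(amplitudes, thetas))
        total += abs(s) ** 2
        count += 1
    return total / count

# Hypothetical values of the three terms for the x, y, and z directions.
amps = [0.7 + 0.2j, -0.4 + 1.1j, 0.3 - 0.5j]
averaged = mean_square_modulus(amps)
incoherent = sum(abs(a) ** 2 for a in amps)
print(averaged, incoherent)
```

The phase-averaged squared modulus equals the sum of the squared moduli of the separate terms, which is the content of the step from the squared sum to the sum of three squared terms.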


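According to Plancherel's theorem

$$\int_{-\infty}^{+\infty} \mathfrak{E}_x^2\,dt = \lambda^2\int_{-\infty}^{+\infty}|\phi_x(\nu)|^2\,d\nu = 2\lambda^2\int_0^\infty |\phi_x(\nu)|^2\,d\nu.$$

Hence the average energy density of the field for the period of the
perturbation $0 < t < \theta$ is

$$\frac{1}{4\pi\theta}\int_0^\theta \mathfrak{E}^2\,dt = \int_0^\infty w(\nu)\,d\nu, \qquad (54·33)$$

where

$$w(\nu) = \frac{\lambda^2}{2\pi\theta}\,\{|\phi_x(\nu)|^2 + |\phi_y(\nu)|^2 + |\phi_z(\nu)|^2\}. \qquad (54·34)$$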


The energy density per unit frequency interval $\rho(\nu)$ which appears in
Eqs. (54·2) and (54·3) is the average value of $w(\nu)$ for a physically
infinitesimal frequency interval wide enough to smooth out the violent
microscopic fluctuations characteristic of the latter, but small compared
with the intervals in which there is an appreciable macroscopic experimental
intensity variation.¹ In the case of a radiation field which is
macroscopically isotropic an averaging over such a physically infinitesimal
frequency interval yields

$$\overline{|\phi_x(\nu)|^2} = \overline{|\phi_y(\nu)|^2} = \overline{|\phi_z(\nu)|^2} = \frac{2\pi\theta}{3\lambda^2}\,\rho(\nu). \qquad (54·35)$$

Hence

$$\overline{|\Gamma_{nm}|^2} = \frac{8\pi^3\theta}{3h^2}\,\rho(\nu_{nm})\,\{|D_x(n,m)|^2 + |D_y(n,m)|^2 + |D_z(n,m)|^2\} \qquad (54·36)$$

if $n \neq m$.

¹ This statement involves the tacit assumption that $1/\theta$ is small compared with
$\nu_{nm}$. Otherwise $w(\nu)$ would not agree with the energy density observed with a normal
photographic exposure.

54e. The Einstein Transition Probabilities. — It is necessary only to
divide the expression (54·36) for the Born transition probability by the
time interval $\theta$ in order to obtain a formal derivation of the Einstein
transition probability $B_{n\to m}$ for transitions between uniquely defined
(nondegenerate) states. The formulas (54·9) and (54·10) for the degenerate
case are then obtainable by the procedure previously employed in
deriving (54·8) from (54·7). It will be observed that our method of
approach gives the correct values for the transition probabilities
associated with induced emission as well as with absorption.

The above procedure must not be accepted as satisfactory without
critical discussion, however, since the application of the Einstein
transition probabilities to thermal equilibrium problems and to the prediction
of the intensities of spectrum lines in absorption and emission involves
much more complicated situations than that envisaged in the derivation
of (54·36).

In applying the Einstein theory of Sec. 54a to a sample of a pure gas
it is customary to assume that the radiation from each molecule is
incoherent with that from the others so that the total radiation emitted
and absorbed per unit time is the same as for a Gibbsian assemblage of
$N$ independent molecules in an appropriately defined mixed subjective
state. We have accordingly to compute the time derivatives of the
diagonal elements of the statistical matrix $\rho'$ for a Gibbsian assemblage
of individual molecules in a chaotic mixed state. Employing the same
procedure as in Sec. 53c we again derive Eqs. (53·26) with $N_n$ interpreted
as the number of molecules in the state whose wave function is $\psi_n$.
Equation (53·27) becomes

$$N_n(\theta) - N_n(0) = \frac{4\pi^2\lambda^2}{h^2}\sum_m [N_m(0) - N_n(0)]\,\{|\phi_x(\nu_{nm})|^2|D_x(n,m)|^2$$
$$\qquad + |\phi_y(\nu_{nm})|^2|D_y(n,m)|^2 + |\phi_z(\nu_{nm})|^2|D_z(n,m)|^2\}. \qquad (54·37)$$

If the radiation field is isotropic, (54·37) reduces to

$$N_n(\theta) - N_n(0) = \frac{8\pi^3\theta}{3h^2}\sum_m [N_m(0) - N_n(0)]\,\rho(\nu_{nm})\,|\mathbf{D}(n,m)|^2. \qquad (54·38)$$



Sec. 54] 


ABSORPTION AND EMISSION OF RADIATION 




Even when the field is not isotropic, it is possible to effect a reduction
of (54-37) similar to (54-38) by means of the theorem that the sums

$$\sum_n\sum_m |D_x(n,m)|^2, \qquad \sum_n\sum_m |D_y(n,m)|^2, \qquad \sum_n\sum_m |D_z(n,m)|^2$$

are equal when each is
extended over all pairs of states n, m belonging to the same energy level.
This theorem we now proceed to prove.

It is convenient to return to the double indexing system of Eqs. (54-8),
(54-9), and (54-10). The typical matrix element of an arbitrary dynamical
variable $\alpha$ is then written $\alpha(n'm';n''m'')$.

Let ${}^{(n',n'')}\alpha$ denote the matrix of $g_{n'}$ rows and $g_{n''}$ columns derived from
$\alpha(n'm';n''m'')$ by allowing $m'$ and $m''$ to range through all values
compatible with specific values of $n'$ and $n''$, respectively. If we multiply
${}^{(n',n'')}\alpha$ by its adjoint we obtain a Hermitian square matrix of
order $g_{n'}$ or $g_{n''}$ according as ${}^{(n',n'')}\alpha$ or its adjoint is the antecedent factor.
In either case the diagonal sum of the product is

$$\sum_{m'}\sum_{m''} |\alpha(n'm';n''m'')|^2 = \mathrm{Spur}\left[{}^{(n',n'')}\alpha\;{}^{(n',n'')}\alpha^{\dagger}\right].$$

By the general theorem of p. 358, Sec. 44b, this quantity is invariant of a
unitary transformation.

Let us apply the above lemma to the partial matrices ${}^{(n',n'')}D_x$,
${}^{(n',n'')}D_y$, ${}^{(n',n'')}D_z$ of the components of the electric moment of the
molecule or atom under consideration. We assume that the unperturbed
Hamiltonian $H_0$ is symmetrical with respect to the three coordinate
axes so that all three components of the angular momentum are conserved.
The basic wave functions of the matrix representation can then be chosen
to be simultaneous eigenfunctions of $H_0$, $\mathcal{L}^2$, $\mathcal{L}_z$. In this case the matrices
${}^{(n',n'')}D_x$, ${}^{(n',n'')}D_y$, ${}^{(n',n'')}D_z$ can be transformed one into the other by

suitably chosen canonical transformations. Thus, in order to establish 
the relation 

(re 

it is only necessary to choose for U the matrix of the transformation which
replaces simultaneous eigenfunctions of $H_0$, $\mathcal{L}^2$, $\mathcal{L}_z$ by simultaneous
eigenfunctions of $H_0$, $\mathcal{L}^2$, $\mathcal{L}_x$. Clearly the matrix representation of $D_x$
in the latter system of coordinate functions is identical with the matrix
representation of $D_z$ in the former system. Since the diagonal sums are
invariant under such a transformation they must have the same initial
values for $D_x$, $D_y$, $D_z$, as was to be proved.
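The invariance of the diagonal sum used in this proof can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming a randomly generated rectangular block as a stand-in for the partial matrix of a dynamical variable between two degenerate levels. The degeneracies 3 and 5 and the helper names `random_unitary` and `diagonal_sum` are illustrative choices, not taken from the text.

```python
import numpy as np

def random_unitary(n, rng):
    """Return an n x n unitary matrix (Q factor of a random complex matrix)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def diagonal_sum(block):
    """Spur(block @ block^dagger), i.e. the sum of |element|^2 over the block."""
    return float(np.trace(block @ block.conj().T).real)

rng = np.random.default_rng(0)
g1, g2 = 3, 5  # hypothetical degeneracies of the two levels

# Random stand-in for the partial matrix (g1 rows, g2 columns).
alpha = rng.normal(size=(g1, g2)) + 1j * rng.normal(size=(g1, g2))

u1 = random_unitary(g1, rng)  # mixes the substates of the first level
u2 = random_unitary(g2, rng)  # mixes the substates of the second level
alpha_new = u1 @ alpha @ u2.conj().T

before = diagonal_sum(alpha)
after = diagonal_sum(alpha_new)
print(before, after)  # the two values agree to rounding error
```

Because the two levels are transformed independently, the sketch covers the general case of a change of base functions within each degenerate level.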

At this point we introduce the double index notation into Eq. (54-37).
It becomes





n' m' 

X {\4>x{vn'n")\^\Dx{n'm'-,n''m'')[^ + k»(pn'n")|*l-0»(nW;n"m")l® 

+ \4>zivn'n"W\J)t{n'm'-,n''m'')\^]. (54-39) 
Let $N_n$ denote the average number of molecules in the energy level $E_n$,
i.e., let $N_n = \sum_m N_{nm}$. Summing (54-39) over the $g_{n''}$
substates of the level $E_{n''}$, we obtain


NAO) - NAO) = -^r'^^^lNn-Ao) - Nn'-AO)] 

n' m* in*' 

X {|4>x(Pn'««)l“|I>x(nW;n''m'0l^ + \<l>yiu,.>n")\Wn'm';n"m'')\^ 

We assume, as in Sec. 54b, that the populations $N_{nm}$ of the different
substates of any energy level $E_n$ are equal. Hence the quantity

$$[N_{n'm'}(0) - N_{n''m''}(0)]$$

is actually independent of the indices $m'$ and $m''$. It can be written as
$\left[\dfrac{N_{n'}(0)}{g_{n'}} - \dfrac{N_{n''}(0)}{g_{n''}}\right]$ and taken out from under the summation signs $\sum_{m'}\sum_{m''}$.

In virtue of the diagonal sum rule the above equation reduces on averaging
over a chaotic assemblage of radiation fields to


Nn<e) ~ Nn"{0) 



Nn<0) 

gn' 


Nn<0) 

On" 


m' m" 


(54*40) 


Recalling that the time interval $\theta$ is by hypothesis an infinitesimal
from the large-scale point of view, we adopt the notation $\Delta N_{n''}/\Delta t$ for the
ratio $[N_{n''}(\theta) - N_{n''}(0)]/\theta$. The theoretical expression for the mean rate
of increase of $N_{n''}$ in a chaotic radiation field now takes the form


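$$\frac{\Delta N_{n''}}{\Delta t} = \sum_{n'} \rho(\nu_{n'n''})\left[N_{n'}B_{n'\to n''} - N_{n''}B_{n''\to n'}\right] \tag{54-41}$$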

where $B_{n'\to n''}$ and $B_{n''\to n'}$ have the values of Eqs. (54-9) and (54-10) and
so are to be identified with the Einstein transition probabilities for
induced jumps¹ between the energy levels $E_{n'}$ and $E_{n''}$.

¹ We retain the language of the Bohr theory and speak of "quantum jumps"
and "transition probabilities" in spite of the fact that the radiation field produces
only continuous changes in the wave functions describing the assemblage under





The complete Einstein law includes the spontaneous transition
probability $A_{n'\to n''}$ from the energy level $E_{n'}$ to each lower level $E_{n''}$.

With the addition of terms of this type the expression for $\Delta N_{n''}/\Delta t$
should be

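$$\frac{\Delta N_{n''}}{\Delta t} = \sum_{\text{all } n'} \rho(\nu_{n'n''})\left[N_{n'}B_{n'\to n''} - N_{n''}B_{n''\to n'}\right] + \sum_{E_{n'} > E_{n''}} N_{n'}A_{n'\to n''} - \sum_{E_{n'} < E_{n''}} N_{n''}A_{n''\to n'} \tag{54-42}$$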

where $A_{n'\to n''}$ has the value (54-8) in virtue of (54-5). As previously
stated, a complete quantum-mechanical derivation of (54-42),
independent of the quasi-thermodynamic theory of Einstein, would treat
the field, as well as the molecule which interacts with it, on a
quantum-mechanical basis.
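The balance between induced and spontaneous terms in a rate law of this kind can be illustrated with a short numerical sketch. It assumes only two nondegenerate levels, so that the two induced coefficients are equal, and uses arbitrary illustrative values for the energy density and the coefficients. The function name `evolve_two_levels` and all numbers are hypothetical, not taken from the text.

```python
def evolve_two_levels(n_upper, n_lower, rho, b_coeff, a_coeff, dt, steps):
    """Integrate the two-level rate law with simple Euler steps.

    The net flow from the upper to the lower level per unit time is
        rho * B * (N_upper - N_lower) + A * N_upper,
    i.e. induced emission minus absorption plus spontaneous emission.
    """
    for _ in range(steps):
        flow = rho * b_coeff * (n_upper - n_lower) + a_coeff * n_upper
        n_upper -= flow * dt
        n_lower += flow * dt
    return n_upper, n_lower

# Illustrative (hypothetical) values in arbitrary units.
rho, b_coeff, a_coeff = 2.0, 1.0, 1.0
n_up, n_low = evolve_two_levels(1.0, 0.0, rho, b_coeff, a_coeff, dt=1e-3, steps=20000)

# In the steady state the flow vanishes, so N_upper / N_lower = rho*B / (A + rho*B).
steady_ratio = rho * b_coeff / (a_coeff + rho * b_coeff)
print(n_up / n_low, steady_ratio)
```

The spontaneous term is what keeps the upper level less populated than the lower one however large the energy density becomes.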

Our present theory is not only weak in its failure to give an account
of spontaneous emission. It treats the radiation field as a known external
perturbing influence and so really fails to tell us anything about the
reaction of the molecule on the field. We must infer from the general
law of the conservation of energy, however, that the energy absorbed
or emitted by the gas is compensated by energy lost or gained by the
radiation field as the case may be. Evidently the phases of the radiation
waves spontaneously emitted must be independent of the external
radiation-field phases. On the other hand classical theory would suggest,
and experiment verifies, that the energy absorbed or emitted as the
result of induced transitions is compensated by spherical radiation
wavelets coherent with the inducing wave system and producing a
damping of the primary waves as they pass through the gas. Hence the
terms of (54-42) involving $A_{n'\to n''}$ and $A_{n''\to n'}$ are associated with the
emissivity of the gas, while the terms in $B_{n'\to n''}$ and $B_{n''\to n'}$ are associated
with the absorption coefficient.

In applying our results to the study of the radiation absorbed and
emitted we have to remember that the spectrum line associated with any
pair of energy levels $E_{n'}$, $E_{n''}$ is not infinitely narrow, as our approximate
theory would indicate, but has a finite frequency breadth, say $2\eta$. We
accordingly introduce the emissivity per unit volume per unit frequency
interval¹ $\epsilon_\nu$, and identify the integral of this quantity over the line
breadth with the total energy radiated per unit volume as a result of
spontaneous transitions from one level to the other. Let the numbers

consideration. Any finite change in a wave function involves a redistribution of
probability for the eigenvalues of an observable $\alpha$ which suggests the existence of
jumps from one eigenstate to another. Thus the idea of discontinuous transitions gets
into the theory in the interpretation of the wave function as a probability amplitude.

¹ Cf. M. Planck, Wärmestrahlung, p. 7, Leipzig, 1933.





$N_{n'}$ and $N_{n''}$ be the populations per unit volume of the corresponding
levels. Then the integrated emissivity for the line is

$$\epsilon(n',n'') = \int \epsilon_\nu\, d\nu = N_{n'}A_{n'\to n''}\,h\nu_{n'n''}. \tag{54-43}$$

The energy absorbed per unit volume per unit time in any given frequency
range $\nu_1 < \nu < \nu_2$ is equal to $c\int \rho(\nu)\alpha_\nu\, d\nu$, where $\alpha_\nu$ is the
absorption coefficient. The integrated value of the absorption coefficient
over the spectrum line is therefore

a(n',n") = J aAv = (54-44) 

Vn>n"-~V 

Experimental measurements of the intensities of spectrum lines in emission
and absorption give the empirical values of $\epsilon(n',n'')$ and $\alpha(n',n'')$,
as the case may be.

54f. Spectroscopic Stability. In using Eqs. (54-8), (54-9), and
(54-10) for the calculation of the intensity of a spectrum line due to
transitions between degenerate energy levels we can employ any set of
orthogonal base functions which are eigenfunctions of the Hamiltonian
of the molecule or atom under consideration. To pass from one such
scheme to another, one must subject the matrices ${}^{(n',n'')}D_x$, ${}^{(n',n'')}D_y$,
${}^{(n',n'')}D_z$ to canonical transformations which leave the diagonal sums
involved in the above equations unchanged.

When a weak constant perturbing field is applied to an atom or
molecule, the energy levels and associated spectrum lines are ordinarily
broken up into fine structures. If the field is weak enough, it will be
legitimate to compute the intensities of these lines using the zero-order
wave functions of perturbation theory. These wave functions are
always possible wave functions for the unperturbed atoms. The total
intensity of all components of a multiple line created by the application
of the field will be obtained in first approximation by a sum similar to
that given in (54-8). Thus in the limit, as the field approaches zero, the
sum of the intensities of the fine-structure components must approach
the intensity of the unperturbed line. This is the principle of
spectroscopic stability.

Magnetic Dipole and Electric Quadrupole Radiation.¹ In
computing the perturbation matrix elements in Sec. 54d we treated the vector
potential $\vec{\mathcal{A}}$ as constant over the atomic system. This approximation gives

¹ For a correspondence principle treatment of electric quadrupole radiation,
including a derivation of the selection and polarization rules in an external magnetic
field, see A. Rubinowicz, Zeits. f. Physik 61, 338 (1930); Rubinowicz and Blaton,
Ergeb. d. exakt. Naturwiss. 11, 181 (1932); Condon and Shortley, T.A.S., Sec. 6⁴.
Formula (7), p. 96 of T.A.S. is misleading as the dyadic, i.e., our $Q$, should be
replaced by our $\mathfrak{Q}$ [cf. Eq. (54-65)].





the contribution of the electric dipole moment $\vec{D}$ to the transition
probability. To carry the computation one step farther we imagine $\vec{\mathcal{A}}$
expanded in power series in x, y, z and keep the linear terms as well as
the constant one, discarding the rest. The new terms give the
contributions of the magnetic dipole and electric quadrupole radiation to the
transition probability. They are practically always negligible in
comparison with the dipole terms unless the latter vanish as a result of a
selection rule. Hence we here neglect cross terms and calculate the
transition probability due to the linear terms alone. In place of (54-16)
we write



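with the understanding that the derivatives of $\vec{\mathcal{A}}$ are to be evaluated at
the origin (i.e., at the center of mass of the system).

$H_1$ is linear in the elements of the matrix

$$S = \begin{pmatrix}
\dfrac{\partial \mathcal{A}_x}{\partial x} & \dfrac{\partial \mathcal{A}_x}{\partial y} & \dfrac{\partial \mathcal{A}_x}{\partial z} \\[2mm]
\dfrac{\partial \mathcal{A}_y}{\partial x} & \dfrac{\partial \mathcal{A}_y}{\partial y} & \dfrac{\partial \mathcal{A}_y}{\partial z} \\[2mm]
\dfrac{\partial \mathcal{A}_z}{\partial x} & \dfrac{\partial \mathcal{A}_z}{\partial y} & \dfrac{\partial \mathcal{A}_z}{\partial z}
\end{pmatrix}.$$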

This matrix is at once resolvable into the sum of matrices $S_s$ and $S_a$,
symmetric and antisymmetric respectively with respect to a
transposition of rows and columns. With the aid of the formula $\vec{\mathcal{H}} = \operatorname{curl} \vec{\mathcal{A}}$,
we obtain


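$$S_s = \begin{pmatrix}
\dfrac{\partial \mathcal{A}_x}{\partial x} & \dfrac12\left(\dfrac{\partial \mathcal{A}_x}{\partial y} + \dfrac{\partial \mathcal{A}_y}{\partial x}\right) & \dfrac12\left(\dfrac{\partial \mathcal{A}_x}{\partial z} + \dfrac{\partial \mathcal{A}_z}{\partial x}\right) \\[3mm]
\dfrac12\left(\dfrac{\partial \mathcal{A}_x}{\partial y} + \dfrac{\partial \mathcal{A}_y}{\partial x}\right) & \dfrac{\partial \mathcal{A}_y}{\partial y} & \dfrac12\left(\dfrac{\partial \mathcal{A}_y}{\partial z} + \dfrac{\partial \mathcal{A}_z}{\partial y}\right) \\[3mm]
\dfrac12\left(\dfrac{\partial \mathcal{A}_x}{\partial z} + \dfrac{\partial \mathcal{A}_z}{\partial x}\right) & \dfrac12\left(\dfrac{\partial \mathcal{A}_y}{\partial z} + \dfrac{\partial \mathcal{A}_z}{\partial y}\right) & \dfrac{\partial \mathcal{A}_z}{\partial z}
\end{pmatrix},
\qquad
S_a = \begin{pmatrix}
0 & -\tfrac12\mathcal{H}_z & \tfrac12\mathcal{H}_y \\[1mm]
\tfrac12\mathcal{H}_z & 0 & -\tfrac12\mathcal{H}_x \\[1mm]
-\tfrac12\mathcal{H}_y & \tfrac12\mathcal{H}_x & 0
\end{pmatrix}. \tag{54-46}$$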
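The resolution into symmetric and antisymmetric parts can be verified numerically. The sketch below is an illustration under stated assumptions: the gradient matrix is filled with random numbers standing in for the derivatives of the vector potential at the origin, with rows labelling the component and columns the coordinate of differentiation. The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# S[i, j] stands for d(A_i)/d(x_j) at the origin (random stand-in values).
S = rng.normal(size=(3, 3))

S_sym = 0.5 * (S + S.T)   # symmetric part
S_anti = 0.5 * (S - S.T)  # antisymmetric part

# Components of H = curl A expressed through the same derivatives.
H = np.array([
    S[2, 1] - S[1, 2],  # H_x = dA_z/dy - dA_y/dz
    S[0, 2] - S[2, 0],  # H_y = dA_x/dz - dA_z/dx
    S[1, 0] - S[0, 1],  # H_z = dA_y/dx - dA_x/dy
])

# Antisymmetric matrix built from the curl components.
expected_anti = 0.5 * np.array([
    [0.0, -H[2], H[1]],
    [H[2], 0.0, -H[0]],
    [-H[1], H[0], 0.0],
])

# Acting on a position vector r, the antisymmetric part gives (1/2) H x r.
r = rng.normal(size=3)
lhs = S_anti @ r
rhs = 0.5 * np.cross(H, r)
print(np.allclose(S_anti, expected_anti), np.allclose(lhs, rhs))
```

The last check shows why the antisymmetric part leads to the magnetic dipole term: it couples the field at the origin to a quantity of the form r × p.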





In (54-45) we next replace each element of $S$ by the sum of the corresponding
elements of $S_s$ and $S_a$. $H_1$ is thereby resolved into the sum of two
terms $K_Q$ and $K_{\mathcal{H}}$ given by the equations



The coefficient of each component of $\vec{\mathcal{H}}$ is readily seen to be the negative
of the corresponding component of the magnetic dipole moment vector
operator $\vec{\mathfrak{M}}$ [cf. Eq. (49-13)]. Hence $K_{\mathcal{H}}$ reduces to the form

$$K_{\mathcal{H}} = -\vec{\mathcal{H}}\cdot\vec{\mathfrak{M}}. \tag{54-49}$$


Here $\vec{\mathcal{H}}$ denotes the magnetic intensity at the origin and is a function of t
only. The contribution of $K_{\mathcal{H}}$ to the perturbation matrix element is
accordingly

^[3C*(<)3n:x(n,m) + + K^{t)'m.,{n,m)]. 

A 

It follows from Eq. (53*15) that the corresponding contribution to 
t.e., is 

^^ (^;^) ~ "“j^[T*(^»»»)3nita;(/2';Wl) "f" 'yy(Pnm)2niIy(/2';W?') “b 'y«(^nw)91Ia!(^;^)] 

(54*50) 

where 


X7(^'nw) == 


$\vec{\gamma}(\nu_{nm})$ is the Fourier transform (cf. p. 36 of Sec. 9a) of the field $\vec{\mathcal{H}}$
at the origin, evaluated for the frequency $\nu_{nm}$. To derive the corresponding
Einstein transition probability we have to take the square of the
absolute value of the matrix element (54-50) and average over all chaotic radiation fields
consistent with a given macroscopic spectrum. Terms involving cross
products such as $\gamma_x(\nu_{nm})\gamma_y^*(\nu_{nm})$ are eliminated by this averaging process.
Moreover, since the mean-square intensity of the magnetic field for any



frequency is equal to the mean square of the electric field for the same
frequency, we can replace $\vec{\gamma}(\nu_{nm})$ by $\vec{\phi}(\nu_{nm})$ in the final result. Thus

Eq. (54*50) is equivalent to (t>(vn,r,) * ^PTiiriyfa). This is 

the same as (54-31) except for the substitution of the matrix of the
magnetic dipole moment for the matrix of the electric dipole moment.
We infer that the Einstein transition probability due to magnetic dipole
radiation acting alone is


Bn'. 



|m(/i 


'/z",m")|^ 


(54*51) 


In applying this formula the magnetic moment of electron spin is to be
added to the orbital moment here considered.¹

Let us turn our attention next to the electric quadrupole moment
term $K_Q$. In the following formulas all matrices are of the
Heisenberg type.

\Kq = -- 


1 da. 


c\ dx 


I day's? I da^'S^Cj 

^ x/p.,- + y/Pw + 

j 


+ 




2c[ dx dt^ ’ ’ ^ dv dt.^ ^ 


dy dt^ 

fdar . dg 
\ dx 


+ ■ ■ ■ ■ (54.52) 


Let $Q_{xx}$, $Q_{xy}$, etc., denote the components of the electric quadrupole
tensor $\sum_j e_j x_j^2$, $\sum_j e_j x_j y_j$, $\cdots$. Carrying out the differentiations with
respect to t indicated in (54-52), we obtain


+ ~I^Qr.n,rri) + (-|- + +•••]• (54-53) 

and 

+ • • • (54-54) 

¹ Cf. H. C. Brinkman, Thesis, Leiden, 1932.





In this last equation the matrix elements of the quadrupole moment tensor
are of the Schrödinger, time-free, type.

Let $G_{xx}(\nu)$, $G_{xy}(\nu)$, etc., denote the Fourier transforms


Gx.(p) 


J_ „ \ dx /o' 




Gxtiv) 



dx /o 


e^‘'‘dt: • • • 


where the subscript zero indicates that the quantity in question is to be
evaluated at the origin. Equations (53-17) and (54-54) now yield


4ir^p “ 

~ Ji^(^ \G xx(,^nnOQxx(lIf^^ Gyy{Vnvi^Qyy{V'f'f^') -f- * * * 

+ Gxy(vnm)Qxy(n,m) + * * * | 2 . ( 54 - 55 ) 

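Taking mean values over a chaotic assemblage of isotropic radiation
fields, we note that

$$\overline{|G_{xx}|^2} = \overline{|G_{yy}|^2} = \overline{|G_{zz}|^2}, \qquad \overline{|G_{xy}|^2} = \overline{|G_{yz}|^2} = \overline{|G_{zx}|^2},$$

$$\overline{G_{xx}G_{yy}^*} = \overline{G_{xx}^*G_{yy}} = \overline{G_{yy}G_{zz}^*} = \overline{G_{yy}^*G_{zz}} = \overline{G_{zz}G_{xx}^*} = \overline{G_{zz}^*G_{xx}}.$$

The mean values of such products as $G_{xz}G_{xy}^*$, $G_{xy}G_{yz}^*$ are zero. Since
div $\vec{\mathcal{A}}$ = 0, $\overline{|G_{xx} + G_{yy} + G_{zz}|^2}$ vanishes. Hence $\overline{G_{xx}G_{yy}^*} = -\tfrac12\overline{|G_{zz}|^2}$.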

The writer is indebted to Prof. J. H. Van Vleck for the suggestion
of the following scheme for relating $\overline{|G_{xy}|^2}$ to $\overline{|G_{zz}|^2}$. The latter quantity
is invariant of a rotation of the axes. Let l, m, n denote the direction
cosines of a new z axis, say z', with respect to the original axes x, y, z
respectively. One can then equate $\overline{|G_{zz}|^2}$ with its transform $\overline{|G_{z'z'}|^2}$
and demand that the equation shall hold identically in l, m, and n. We
obtain

$$\overline{|G_{zz}|^2} = \overline{|G_{zz}|^2}\left[(l^4 + m^4 + n^4) - (l^2m^2 + m^2n^2 + n^2l^2)\right] + \overline{|G_{xy}|^2}\,(l^2m^2 + m^2n^2 + n^2l^2).$$

Since $(l^2 + m^2 + n^2)^2 = 1$, the above relation requires that

$$\overline{|G_{xy}|^2} = 3\,\overline{|G_{zz}|^2}.$$

Equation (54-55) becomes
4^4 y 2 i 

+ IQ..I* + 3(|Q^|» + 

+ I0«P) - + Qss*Qyy + QyyQ,.* + ’ * ' )|- (54-56) 

Denoting the quantity $Q_{xx} + Q_{yy} + Q_{zz}$ by $\Delta$, we have

$$|Q_{xx}(n,m) + Q_{yy}(n,m) + Q_{zz}(n,m)|^2 = |\Delta(n,m)|^2.$$



With the aid of this relation we can reduce (54-56) to the form 

0^4 j, 2 { 

4 " \Qyy{'fly'^^)\^ 4 * \Q zz(jlyW.)\^ 

■f 2\Q^(n,m)\^ + 2\Qy,{n,m)\^ + 2|y„(n,w)|2 - ||A(n,m)p|- (54-57) 

As a final step in evaluating the Einstein transition probability for
an isotropic radiation field we must relate $\overline{|G_{zz}(\nu)|^2}$ with the energy
density $\rho(\nu)$. It is necessary for this purpose to describe the perturbing
field as a superposition of plane waves. We accordingly express $\vec{\mathcal{A}}$ in the
form of a multiple Fourier integral:


where 

Then 


^ -r ” 

aix, y,z,t) = fff lAi(a)e‘*’ + Ai{a)(ri^]da^aycUr, (54-58) 


(p = %r[x<rx + xcy + Z(Tz — vt]. 


in 


)e^ — A2z((^)c~*^]daxCl,ayd<Tz, 


Let $\vec{B}(\vec{\sigma})$ denote the vector $\vec{A}_1(\vec{\sigma}) - \vec{A}_2(\vec{\sigma})$. Then

^ = 2Tri j* j* J* (F zB xdtTyda z. 

By Plancherel's theorem

<Fz^\Bz{(t)\ '^d(Txd(Jyd(Tz 

” ^ ^ ^ (54*59) 


Here $d\Omega$ denotes an element of solid angle in $\sigma$-space.

We are interested solely in the field at the origin between t = 0 and
t = $\theta$. According to Huygens' principle this field is determined by the
field at t = 0 inside a sphere of radius $c\theta$ drawn about the origin as a
center. The average value of $(\partial \mathcal{A}_z/\partial z)^2$ for this initial spatially
distributed field can be identified with the average value of $(\partial \mathcal{A}_z/\partial z)^2$ at
the origin for the time interval $0 < t < \theta$. We shall in fact identify
the mean contribution of each portion of the spectrum to one of these
average values with the corresponding mean contribution to the other.
First of all


■ sIj'XXX 





Combining (54-59) with (54-60), we obtain 




(o') I 12. 


Equating the statistical mean values of the integrands for a chaotic
assemblage of isotropic radiation fields, we have




(54-61) 


A similar tn^atment of -- yi(‘lds 
cdt 


U'O ■ tX J J J 6 

j <r^|B.(a) I Q. (54-62) 


Here <t>z{u) is defined by (54-30), In view of the Work of Sec. 54d we can 
identify X‘|05 (j^)|‘^ with ^^p(i^). Equations (54-61) and (54*62) now give 

o 




(54*63) 


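Since div $\vec{\mathcal{A}}$ is zero, the vectors $\vec{A}_1$, $\vec{A}_2$, and $\vec{B}$ are perpendicular to $\vec{\sigma}$.
If $\psi$ denotes the angle between $\vec{B}$ and the plane through $\vec{\sigma}$ and the z axis,
while $\chi$ denotes that between $\vec{\sigma}$ and the positive z axis, we have
$B_z = |B| \cos\psi \sin\chi$. Hence

$$\overline{|B_z|^2} = \overline{|B|^2 \cos^2\psi}\,\sin^2\chi = \tfrac12\,\overline{|B|^2}\,\sin^2\chi.$$

$\overline{|B|^2}$ will be independent of the orientation of the vector $\vec{\sigma}$. Thus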


|0.(p)|“ = ^^p(p)^ - 

^ jr .sm^x dx 

On removing unnecessary absolute value signs (54-57) now yields 

= ^i^p(Pnm)P»m*|^Q®i(n,w) — 






p(Pnm)j'nm MO XX (n,m) 


+ ( Qvy{n,m) 




0«(n,m) 


+ 2Q*,(n,m)^ + 2Qyt{n,mY -|- 2Q,x(n,' 


(54-64) 



Sec. 55] 


ELEMENTARY SELECTION RULES 




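Let $\mathfrak{Q}(n,m)$ denote the tensor whose matrix is

$$\begin{pmatrix}
Q_{xx} - \tfrac13\Delta & Q_{xy} & Q_{xz} \\
Q_{yx} & Q_{yy} - \tfrac13\Delta & Q_{yz} \\
Q_{zx} & Q_{zy} & Q_{zz} - \tfrac13\Delta
\end{pmatrix}.$$

Let $|\mathfrak{Q}(n,m)|^2$ denote the sum of the squares of the nine elements of
$\mathfrak{Q}(n,m)$. (54-64) reduces to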

Using the double index notation appropriate to degenerate energy levels
and summing over all types of transition from one such level to another
we derive the Einstein transition probability



m' m' 


(54*65) 


The transition probabilities $B_{n''\to n'}$ and $A_{n'\to n''}$ follow directly from the
above with the aid of Eqs. (54-4) and (54-5).
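The algebra connecting the bracket of (54-57) with the traceless tensor can be confirmed with a few lines of Python. This is a sketch only: it assumes a real symmetric matrix of random numbers standing in for the quadrupole matrix elements of a single transition, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Random real symmetric stand-in for the quadrupole elements Q_xx, Q_xy, ...
a = rng.normal(size=(3, 3))
Q = 0.5 * (a + a.T)

delta = np.trace(Q)                    # Q_xx + Q_yy + Q_zz
Q_traceless = Q - delta / 3.0 * np.eye(3)

# Sum of the squares of the nine elements of the traceless tensor.
nine_element_sum = float(np.sum(Q_traceless ** 2))

# Bracket of the form: diagonal squares + 2 * off-diagonal squares - (1/3) delta^2.
bracket = (
    Q[0, 0] ** 2 + Q[1, 1] ** 2 + Q[2, 2] ** 2
    + 2.0 * (Q[0, 1] ** 2 + Q[1, 2] ** 2 + Q[2, 0] ** 2)
    - delta ** 2 / 3.0
)
print(nine_element_sum, bracket)  # equal to rounding error
```

Subtracting one third of the diagonal sum removes the part of the tensor that cannot radiate, which is why only the traceless combination appears in the transition probability.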


55. SOME ELEMENTARY SELECTION RULES FOR ELECTRIC DIPOLE RADIATION

55a. The Harmonic Oscillator. The usual selection rules which
distinguish "allowed" and "forbidden" jumps and permit us to predict
the nature of the radiation spectrum of an atomic system are derivable
from the fundamental intensity formulas for dipole radiation, (54-9)
and (54-10). Let us consider first the ideal linear oscillator of Sec. 20.
We assume that the oscillator has a charge e and proceed to compute the
electric moment matrix, which can be treated as a scalar if the motion
is along one of the coordinate axes:

Z)(n,n') = 

The substitution 

lAn = CnH„{^)e 2, f 

yields 

D(n,n') =r 

The recurrence formula (20*8) gives 
D(n,n') = 

The normalization-orthogonality rule (20-10) shows at once that $D(n,n')$
vanishes unless $n' = n \pm 1$. Spontaneous jumps from one state to





another with emission of radiation will then occur only from any given
level to the next lower in the series. The only frequency radiated is
$(E_n - E_{n-1})/h = \nu_c$, the frequency of the classical oscillation. The
intensity of emission per atom in the initial energy level $E_n$ is

= ^y.*\D(n,n - 1)|S (55-1) 


where $D(n,n-1)$ is readily reduced to


Din,n - 1) = 


2k ) 






En - E, 


2k 


V^Xn-X-Xn). (55-2) 


Here $\chi_n$ and $\chi_{n-1}$ are the phase constants of $c_n$ and $c_{n-1}$. The reader will
readily verify that the intensity of radiation so calculated reduces to the
classical value if $E_0$ is neglected in comparison with $E_n$.
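The oscillator selection rule can be reproduced numerically. The sketch below is a hedged illustration, not the text's derivation: it works in the dimensionless coordinate ξ, takes the charge as unity, and evaluates the matrix of ξ between normalized Hermite functions by Gauss-Hermite quadrature. The function name `oscillator_position_matrix` is hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

def oscillator_position_matrix(nmax, npts=20):
    """Matrix of the dimensionless coordinate xi between oscillator states 0..nmax.

    Uses Gauss-Hermite quadrature, which integrates polynomial * exp(-xi^2)
    exactly for polynomial degree <= 2*npts - 1.
    """
    xi, w = hermgauss(npts)
    h = np.empty((nmax + 1, npts))
    for n in range(nmax + 1):
        coeffs = np.zeros(n + 1)
        coeffs[n] = 1.0  # select the Hermite polynomial H_n
        norm = math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
        h[n] = hermval(xi, coeffs) / norm
    # X[n, n'] = integral of psi_n * xi * psi_n'
    return (h * (w * xi)) @ h.T

X = oscillator_position_matrix(5)
print(np.round(X, 6))
```

Only the elements adjacent to the diagonal are different from zero, and they equal the square root of n/2 for the pair (n, n-1), so that the squared moment grows linearly with n.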

55b. Selection Rules for the Two-particle Problem. In the case of
a two-particle problem with a homogeneous external magnetic field we
have learned from Sec. 49 that the wave functions in the field are, to a
first-order approximation, the same as those obtained by separating the
variables in spherical coordinates with the z axis in the direction of the
field. Hence the matrix elements of the electric moment calculated
from these functions can be used for computing both the intensities and
polarizations of the components of the Zeeman pattern and for the
intensities of the unperturbed lines.

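Denoting the eigenvalues of $\mathcal{L}_z$ and $\mathcal{L}^2$ as usual by

$$\mathcal{L}_z' = \frac{mh}{2\pi}, \quad m = 0, \pm 1, \cdots, \pm l; \qquad (\mathcal{L}^2)' = \frac{l(l+1)h^2}{4\pi^2}, \quad l = 0, 1, 2, \cdots, \tag{55-3}$$

we shall prove the selection rules

$$m \to m,\ m \pm 1; \qquad l \to l \pm 1. \tag{55-4}$$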


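In appropriate spherical coordinates the classical formulas for the
components of the electric moment are

$$D_x = er \sin\theta \cos\varphi, \qquad D_y = er \sin\theta \sin\varphi, \qquad D_z = er \cos\theta.$$

It is convenient to work with the linear combinations

$$\mathfrak{D} = D_x + iD_y = er \sin\theta\, e^{i\varphi}, \qquad \mathfrak{D}^\dagger = D_x - iD_y = er \sin\theta\, e^{-i\varphi}$$

instead of $D_x$ and $D_y$ themselves. $\mathfrak{D}^\dagger$ is then the adjoint of $\mathfrak{D}$.
Evidently

$$|D|^2 = |D_x|^2 + |D_y|^2 + |D_z|^2 = \tfrac12|\mathfrak{D}|^2 + \tfrac12|\mathfrak{D}^\dagger|^2 + |D_z|^2. \tag{55-5}$$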




Using the normalized wave functions of Sec. 28, we have 




D,{n,l,m; — S>T{n,l; n',r)Ze{l,m; 

n' , 1 ' ,m') = a)r(n,Z; n',l')£)e(l,m; V ,m')'Si^{m-,m'), \ (55-6) 

T>^{n,l,m; n',l',m') = 3 Jr(n,i; n',l') T>»^{l,m] Z',m')SD/(m;m'),j 

where 


Sirin, 1; n',l') = ejT R„iR„>rrMr = eJ^“r(Rni(Si«'i'dr, 
Zg(l,m;l',m') ^ sin 6 cos 6 dd, | 

Sieil,m,;V ,m') = Sig^il,m;l' ,m') = sin“0 d8,j 


Z^im;m') 


1 r*' 

1 . 

^ I m'-f 

2^ Jo 


1 



(55-7) 

(55*8) 

(55*9) 


Equations (55-9) show that all components of the D matrix vanish unless
$m' = m$ or $m \pm 1$. Thus the m selection principle drops out of the basic
formulas without computation.
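The vanishing of the azimuthal integrals can be seen in a short numerical sketch. It assumes wave functions proportional to exp(imφ) and an operator factor exp(iqφ), with q = 0 for the z component and q = ±1 for the two circular combinations. The sign convention linking q to the two combinations, and the function name `phi_factor`, are assumptions of this illustration.

```python
import numpy as np

def phi_factor(m, m_prime, q, npts=64):
    """(1/2pi) * integral over 0..2pi of exp(-i m phi) exp(i q phi) exp(i m' phi).

    A uniform grid evaluates this exactly (to rounding) as long as
    |m' + q - m| < npts.
    """
    phi = 2.0 * np.pi * np.arange(npts) / npts
    return np.mean(np.exp(1j * (m_prime + q - m) * phi))

# Tabulate |integral| for m = 2 and m' = 0..4.
for q in (0, 1, -1):
    row = [round(abs(phi_factor(2, mp, q)), 12) for mp in range(5)]
    print(q, row)
```

For each q exactly one value of m' survives, namely m' = m - q, which reproduces the rule that m changes by 0 or ±1.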

The principle of selection for the azimuthal quantum number is
derivable from an examination of the integrals $Z_\theta$, $\mathfrak{D}_\theta$, $\mathfrak{D}_\theta^\dagger$. In evaluating
each integral we give $m'$ the value required to prevent the cofactor
$Z_\varphi$, $\mathfrak{D}_\varphi$, or $\mathfrak{D}_\varphi^\dagger$, as the case may be, in Eqs. (55-6) from vanishing. Thus,

introducing the symbol $c_{l,m}$ for the normalizing factor

i{2i + m - hi)! i^ 

)■ 2(Z + H)! / 

of we have 

/ -f 1 

j ^ m| (ir)dic, (55T0) 

Formula (F-10), Appendix F, permits us to express $xP_{l,|m|}(x)$ as a linear
combination of $P_{l-1,|m|}$ and $P_{l+1,|m|}$. In view of the orthogonality
equation (F-15) it follows that $Z_\theta(l,m;l',m)$ vanishes unless $l' = l \pm 1$.
Similarly $\mathfrak{D}_\theta(l,m;l',m+1)$ and $\mathfrak{D}_\theta(l,m;l',m-1)$ vanish unless $l' = l \pm 1$.

Thus all components of the electric-moment matrix vanish for transitions
in which l changes by any amount other than ±1. This proves the
selection rule for the azimuthal quantum number l.
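A numerical check of the l rule is easy for the special case m = m' = 0, where the polar functions reduce to ordinary Legendre polynomials. The sketch below assumes the normalization sqrt((2l+1)/2) for these functions and evaluates the integral of cos θ between two of them by Gauss-Legendre quadrature. The function name `cos_theta_element` is hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def cos_theta_element(l, l_prime, npts=32):
    """Integral of Theta_l(x) * x * Theta_l'(x) over x = cos(theta) in [-1, 1], m = 0.

    Theta_l(x) = sqrt((2l + 1) / 2) * P_l(x) is the normalized polar function.
    """
    x, w = leggauss(npts)
    p_l = legval(x, [0.0] * l + [1.0])          # P_l(x)
    p_lp = legval(x, [0.0] * l_prime + [1.0])   # P_l'(x)
    norm = math.sqrt((2 * l + 1) * (2 * l_prime + 1)) / 2.0
    return norm * float(np.sum(w * x * p_l * p_lp))

# Table of elements for l, l' = 0..3; only l' = l +/- 1 survive.
table = np.array([[cos_theta_element(l, lp) for lp in range(4)] for l in range(4)])
print(np.round(table, 6))
```

The surviving elements equal (l+1)/sqrt((2l+1)(2l+3)) for the pair (l, l+1), in agreement with the recurrence formula for the Legendre polynomials.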

55c. Fine Structure and Polarization of Spectrum Lines in Simple
Zeeman Effect. Consider next the application of the theory of absorption
to hydrogenic atoms without electron spin in a uniform magnetic




field of strength $\mathcal{H}$. Taking the z axis in the direction of the field we
learn from Sec. 49c that the energy of the state $u_{n,l,m}$, which depends on
$\varphi$ through the factor $e^{im\varphi}$, is $E_n + mh\nu_{\mathcal{H}}$, where $\nu_{\mathcal{H}}$ is the
Larmor frequency, $|e|\mathcal{H}/4\pi\mu c$, and $E_n$
is the unperturbed energy. Let $\nu^{(0)}$ denote the frequency of the
unperturbed emission and absorption line due to transitions between
the states $u_{n,l,m}$ and $u_{n',l',m'}$. We assume that $E_n > E_{n'}$ so that

$$\nu^{(0)} = \frac{E_n - E_{n'}}{h}.$$

The frequency of the perturbed line due to this special transition is then
$\nu^{(0)} + (m - m')\nu_{\mathcal{H}}$. Every unperturbed line is accordingly split by the
field into three components whose frequencies are $\nu^{(0)} - \nu_{\mathcal{H}}$, $\nu^{(0)}$, and
$\nu^{(0)} + \nu_{\mathcal{H}}$, coming respectively from the three types of allowed jumps
in which $m - m'$ has the values −1, 0, and +1.
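To give a sense of scale, the splitting can be evaluated with a brief sketch. It assumes Gaussian units, uses the electron mass in place of the reduced mass, and takes the Larmor frequency as |e|ℋ/4πμc. The constants are rounded values and the function names are illustrative.

```python
import math

# Rounded constants in Gaussian (cgs) units.
E_CHARGE = 4.8032e-10  # electron charge magnitude, esu
E_MASS = 9.1094e-28    # electron mass, g (used here in place of the reduced mass)
C_LIGHT = 2.9979e10    # speed of light, cm/s

def larmor_frequency(field_gauss, mass=E_MASS):
    """Larmor frequency |e| H / (4 pi mu c) in Hz for a field in gauss."""
    return E_CHARGE * field_gauss / (4.0 * math.pi * mass * C_LIGHT)

def zeeman_triplet(nu0, field_gauss):
    """Frequencies nu0 + (m - m') * nu_L for m - m' = -1, 0, +1."""
    nu_l = larmor_frequency(field_gauss)
    return {dm: nu0 + dm * nu_l for dm in (-1, 0, 1)}

nu_l = larmor_frequency(1.0e4)          # field of 10^4 gauss
triplet = zeeman_triplet(5.0e14, 1.0e4) # hypothetical optical line at 5e14 Hz
print(nu_l, triplet)
```

For a field of 10^4 gauss the displacement is of the order of 10^10 Hz, a very small fraction of an optical frequency, which is why high resolving power is needed to observe the pattern.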

These three components show three distinctive types of polarization,
as in the classical theory of the simple Zeeman effect. In order to
demonstrate this we rewrite Eq. (54-31) in the form

\F<»\k,k’} = - i<l>uMMk,k') +l[<l>.(ykk') 

+ i<t>vMWik,k') + <l>^MZik,k')y (55-11) 

In the case of the central component of the absorption triplet $m' = m$ and
the matrix components of $\mathfrak{D}$ and $\mathfrak{D}^\dagger$ vanish. Thus only the component
of the electric force of the impressed wave in the z direction is effective in
producing absorption of this frequency. If light is traveling in the
direction of the lines of force, or, if it is traveling perpendicular to the lines
of force and is polarized with the electric vector at right angles to the
lines of force, there is no absorption. This is the characteristic of the
central component in the classical theory.

Consider next the high-frequency component $\nu = \nu^{(0)} + \nu_{\mathcal{H}}$. This is
associated with the matrix element $\mathfrak{D}^\dagger(n,l,m;\,n',l',m-1)$, the
corresponding matrix elements of $\mathfrak{D}$ and $Z$ being zero. Equation (55-11)
reduces to

+ i<l>uivw)]^i{k,k')y (65-12) 

If the light of frequency $\nu = \nu^{(0)} + \nu_{\mathcal{H}}$ is an approximate plane wave
traveling perpendicular to the lines of force, say along the x axis, only
the component polarized with the electric vector perpendicular to the
field is effective in producing absorption. If the light of frequency





$\nu^{(0)} + \nu_{\mathcal{H}}$ travels along the lines of force and is circularly polarized in the
sense of a left-hand screw the phase of the y component of the electric
force will be 90° ahead of that of the x component and $\phi_y = +i\phi_x$.
Hence $\phi_x + i\phi_y$ vanishes and the light is not absorbed. Maximum
absorption for a given value of the intensity occurs when the light is
circularly polarized in the sense of a right-hand screw so that $\phi_y = -i\phi_x$ and

$$\frac{\phi_x + i\phi_y}{2} = \phi_x.$$

The low-frequency component has, of course, the reverse behavior with
respect to the two types of circularly polarized light propagated in the
direction of the lines of force. Thus the Zeeman pattern in absorption
for our model atom has precisely the same properties as in the familiar
classical theory of the simple Zeeman effect for the same model.

The polarization characteristics of the emission triplet can be derived
from those of the absorption triplet with the aid of Kirchhoff's law or the
principle of detailed balancing.

The selection rules for many-particle problems require a rather
elaborate analysis and will be worked out in Chap. XIV after we have
developed methods for dealing with electron spin.



CHAPTER XIII 


INTRODUCTION TO THE PROBLEM OF ATOMIC STRUCTURE: 

ELECTRON SPIN 


56. THE ATOMIC PROBLEM AS A TWO-PARTICLE PROBLEM 

56a. The Empirical Basis for the Idealized Bohr Atom Model. Our
present quantum-mechanical theory of the extranuclear structure of the
atom is an outgrowth and refinement of the Bohr interpretation of the
optical spectra of the atoms. A number of important questions which
arise in both theories were settled by Bohr's early investigations. On
this account our approach to the present theory is one which leans heavily
on the historical development of the earlier work.

In the sketch of the Bohr theory in Sec. 46 it was pointed out that, 
whereas strictly speaking the theory is applicable only to systems whose 
classical motions are multiply periodic, the original Rutherford model 
of any of the many-electron atoms gives rise to classical motions which, 
in the vast majority of cases, are not multiply periodic, even when the 
radiation resistance forces are neglected. In fact it is easy to see that 
spontaneous ionization would almost certainly be the speedy fate of a 
classical Rutherford atom started off with the energy of an empirical
stationary state. Hence it was necessary to introduce an idealization 
of the Rutherford model whose motions are multiply periodic. 

In devising this substitute model Bohr was guided by the empirical
data. The alkali metals formed a group of elements of primary interest.
They were known to carry one easily detachable electron, the valence
electron of chemistry. They have particularly simple spectra which,
ignoring fine structure, suggest the motion of a two-particle system, or
of a single electron in a central force field. The energy levels are grouped
together in a number of series known to spectroscopists as sharp series,
principal series, etc. The spectra show that transitions do not occur
between the different levels of the same series, or between all different
series. They are subject to a principle of selection, and if we assign
the ordinal numbers l as follows,


    Series:   Sharp   Principal   Diffuse   Fundamental
    l:        0       1           2         3


it turns out that the selection rule is exactly that derived in Sec. 55b
for the azimuthal, or angular-momentum, quantum number in a two-





Sec. 56] 


THE TWO-PARTICLE ATOMIC MODEL 




particle problem, viz., $\Delta l = \pm 1$. The development of the theory will
show that the ordinal number l can in fact be identified with the
angular-momentum quantum number of the valence electron.

Each series of levels is composed of an apparently infinite number of
members whose energies approach a well-defined limit as we ascend the
series. To each level there is assigned an ordinal number n giving its
place in the series and beginning with an initial positive integer $n_0$
which was formerly assigned somewhat arbitrarily, but which can be
chosen in such fashion as to give n a theoretical significance. The
symbols s, p, d, f, g, ... have been assigned to the successive series
arranged as above in the order of increasing l values. The individual
states are indicated by ns, np, nd, ... . Thus 3p denotes the state
corresponding to a principal-series energy level with the ordinal number 3.

Taking the limit of any one of these series as the zero level of energy
(not otherwise defined by spectroscopic data) we can describe the series
by one or the other of the following well-known empirical formulas:

Rydberg formula:

$$E_{nl} = -\frac{Nhc}{(n - a_l)^2}, \qquad n = n_0,\ n_0 + 1,\ \cdots \tag{56-1}$$

Ritz formula: 

Here N is the Rydberg constant, sensibly the same as for the hydrogen
series, and $a_l$ and $b_l$ are empirical constants for any one series, but
functions of l if we wish to think of all the series together. The position of
the lowest energy level of any given series fixes the value of the difference
$n_0 - a_l$ in the Rydberg formula for that series. The value of $n_0$ was
originally chosen for each series so as to make the associated value of $a_l$
as small numerically as possible. The Ritz formula is evidently the
more accurate as it involves two adjustable constants and includes the
Rydberg formula as a special case.
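The way in which the lowest level of a series fixes the difference between the initial integer and the Rydberg correction can be sketched numerically. The example assumes the Rydberg formula in the form E = -Nhc/(n - a)², with energies expressed as wave numbers (E/hc, in cm⁻¹), and uses a purely hypothetical correction a = 1.35. The function names and the numerical correction are illustrative and are not data from the text.

```python
import math

RYDBERG = 109737.3  # Rydberg constant in cm^-1 (rounded, infinite nuclear mass)

def rydberg_term(n, a_l, rydberg=RYDBERG):
    """Term value E/hc in cm^-1 from the assumed form -N / (n - a_l)^2."""
    return -rydberg / (n - a_l) ** 2

def effective_quantum_number(term, rydberg=RYDBERG):
    """Invert the formula: n - a_l = sqrt(-N / (E/hc))."""
    return math.sqrt(-rydberg / term)

# Hypothetical series starting at n0 = 3 with correction a_l = 1.35.
lowest = rydberg_term(3, 1.35)
difference = effective_quantum_number(lowest)  # recovers n0 - a_l = 1.65
hydrogen_like = rydberg_term(2, 0.0)           # a_l = 0 gives the hydrogen value -N/4
print(lowest, difference, hydrogen_like)
```

Only the difference is determined by the observed level; splitting it into an integer and a correction requires the additional convention described in the text.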

These energy-level formulas bear a striking resemblance to the energy
formula for the hydrogen atom, a resemblance reinforced by the
observation that for large values of l the quantities $a_l$ and $b_lE_{nl}$ can be made
very small in comparison with unity by proper choice of the corresponding
minimum integer $n_0$. In fact there is a critical value of l for each alkali
metal beyond which all energy levels are in close agreement with
corresponding hydrogen levels. This rule is illustrated by Fig. 23, which
shows a chart of the energy levels of the optical spectrum of sodium with
the energy laid out as abscissa and the azimuthal quantum number as
ordinate. Vertical dotted lines indicate the energy levels of hydrogen.

These facts suggested the possibility of developing an approximate 
account of the spectrum of such an atom as sodium on the hypothesis
that the different energy levels represent different motions of the valence 
electron, the other particles forming a more or less stable symmetrical 
core or atom body. The assumed symmetry and stability of the central 
group are empirically reasonable in view of the fact that this group 
contains the same number of electrons as the very stable neon atom 
which precedes sodium in the periodic table. A similar statement 
applies to every alkali atom. The simplest assumption was then that 
in effect the valence electron moves in a central force field compounded
of the field of the nucleus and the average field of the core. If the motion
of the nucleus is taken into consideration, one has a typical two-particle
problem like that of Sec. 28. The substitution of this two-particle
model for the actual complex atom implies a definite break with the
third Bohr postulate, but such a break was a priori inevitable, as we
have already seen. It was immediately justified by the light which it
threw not only on the spectra of the alkalies but also on the general
problem of atomic structure.

Fig. 23. Diagram of the energy levels of the optical spectrum of sodium. (After Bohr.)

We shall not attempt here to follow in detail the application of 
Bohr’s theory to this central-field problem but will develop the same idea 
by the method of wave mechanics. Our first objective will be the 
derivation of the Ritz formula (56-2) from our model. 

The essential feature of this treatment of the sodium atom as a two- 
particle problem is that we neglect the details of the interaction between 
the inner electrons and the valence electron, considering only the average 
interaction. Translating this statement into the language of wave 
mechanics, we assume that to a first approximation the wave function
of the atom can be resolved into the product of two factors, one of which
involves the coordinates of the valence electron only, while the other
describes the state of the core and depends only on the coordinates
of the inner group of electrons. We assume that the latter function is 
similar to the wave function of the normal state of the unperturbed ion 
obtained by removing the valence electron, and concern ourselves only 
with the former function. 

The Schrödinger equation which determines the contribution of the
valence electron to the atomic energy will then be of the form of the
two-particle equation (28·1). In order to determine the qualitative character
of the potential of the field in which the valence electron moves, we note
that by elementary electrostatics the field of a uniform spherical shell
of electricity is zero at interior points, and at exterior points is equal
to the field of the same total charge when concentrated at the center of the
shell. In calculating the potential we treat the core as a spherical
distribution of charge of total amount $(Z - 1)e$, where $Z$ is the atomic
number. We shall suppose that practically all of this charge lies inside
a radius $r_1$ of the order of magnitude of the kinetic-theory radius of the
inert-gas atom of atomic number $Z - 1$. Then, if $V(r)$ is the net
potential energy due to the nuclear charge and the core, we have


$$V(r) = -\frac{e^2}{r}, \qquad r \ge r_1,$$

and

$$\lim_{r \to 0} r\,V(r) = -Ze^2.$$ (56·3)


There are a number of ways of estimating appropriate values of $V(r)$ for
the interior of the core, of which the best known are the method of Thomas
and Fermi¹ and the method of the Hartree self-consistent field.² For our
present qualitative purpose, however, it will suffice to assume that
$r^2\,dV/dr$ varies linearly with $r$ between the limits 0 and $r_1$. Figure 24
shows a comparison between the potential-energy curves for the
hydrogen atom and the sodium-atom model ($Z = 11$), the latter worked out
on the above simple basis, assuming that, for sodium, $r_1$ is twice the
radius $a_0$ of the innermost Bohr orbit for hydrogen.
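The linear interpolation of $r^2\,dV/dr$ can be integrated in closed form. The sketch below assumes units in which the hydrogen potential energy is $-2/\rho$ (energy in Rydbergs, $\rho = r/a_0$), so that $\rho^2\,dV/d\rho$ falls linearly from $2Z$ at the nucleus to 2 at $\rho_1$; the function names and the choice of units are assumptions made for illustration.

```python
import math


def coulomb_potential(rho):
    """Hydrogen potential energy in the assumed units: V_H = -2/rho."""
    return -2.0 / rho


def model_potential(rho, z=11, rho1=2.0):
    """Central-field model potential energy for the valence electron.

    Inside the core (rho < rho1) the quantity rho^2 dV/drho is taken to be
    2*z - 2*(z - 1)*rho/rho1, i.e. linear in rho, equal to 2*z at rho = 0
    and to 2 at rho = rho1. Integrating and matching to -2/rho at rho1 gives
    V = -2*z/rho + (2*(z - 1)/rho1) * (1 - ln(rho/rho1)).
    """
    if rho >= rho1:
        return coulomb_potential(rho)
    return -2.0 * z / rho + 2.0 * (z - 1) / rho1 * (1.0 - math.log(rho / rho1))


if __name__ == "__main__":
    for rho in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(rho, coulomb_potential(rho), model_potential(rho))
```

With these assumptions the model curve joins the Coulomb curve continuously at $\rho_1$ and lies far below it inside the core.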

The reason for this choice of $r_1$ is as follows. We have already
observed that the energies of states of the alkali atoms with large values
of the azimuthal quantum number $l$ are very nearly coincident with
1 L. H. Thomas, Proc. Cambridge Phil. Soc. 23, 542 (1926); E. Fermi, Zeits. f.
Physik 48, 73 (1928), 49, 550 (1928). A full bibliography of the literature of the
Thomas-Fermi and Hartree methods is given by L. Brillouin in the pamphlets
"L'Atome de Thomas-Fermi" and "Les Champs 'Self-Consistents' de Hartree et de
Fock," Paris, 1934.

2 D. R. Hartree, Proc. Cambridge Phil. Soc. 24, 89, 111, 426 (1928); J. C. Slater,
Phys. Rev. 35, 210 (1930); V. Fock, Zeits. f. Physik 61, 126 (1930).



corresponding hydrogenic energies. We shall refer to states of this
kind as hydrogenic states, or hydrogenic orbits. We assume that in states
of this kind the electron spends practically all its time in the Coulomb
field region outside the core (the wave function has values inside the core
very small compared with those outside). There will be an effective "distance
of closest approach" to the nucleus for such states, which can be
calculated as if the valence electron moved in a pure Coulomb field. The
minimum distance of closest approach for any hydrogenic state gives
an upper limit to the value of $r_1$ because any electron which penetrates
the core is sure to have its energy affected by that fact.


Fig. 24. Approximate central-field potential energy for valence electron of sodium
compared with potential energy for hydrogen atom. The abscissa $\rho$ is $r/a_0$.

$V_H(\rho) = -2/\rho$: Potential energy of hydrogen atom in Hartree atomic units.
$V(\rho)$: Approximate potential energy for central-field model of sodium atom in atomic units.
$\Phi_H(\rho) = V_H(\rho) + (l + \tfrac{1}{2})^2/\rho^2$: Effective potential energy for hydrogen atom with $l = 1$.
$\Phi(\rho) = V(\rho) + (l + \tfrac{1}{2})^2/\rho^2$: Effective potential energy for sodium atom with $l = 1$.

Pauling and Goudsmit¹ have made a rough estimate of the distance of
closest approach which gives the value $a_0$ (cf. p. 158) for principal-series
orbits, $3a_0$ for diffuse-series orbits, etc. Empirically the energies of the
principal-series states in sodium show marked deviations from the hydrogenic
values, while the diffuse-series states show very slight deviations. Hence
we can be reasonably sure that the radius of the core of the sodium atom
is between the values $a_0$ and $3a_0$. The choice of $2a_0$ for $r_1$ used in
preparing Fig. 24 is accordingly reasonable.
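Numbers of this size follow from an elementary Coulomb-orbit estimate. The sketch below is one way to arrive at them, not necessarily the estimate Pauling and Goudsmit made: it assumes a hydrogen orbit of energy $-1/n^2$ Rydbergs with squared angular momentum $l(l+1)\hbar^2$ and solves $-1/n^2 = -2/\rho + l(l+1)/\rho^2$ for the inner turning point. The function name and the choice $l(l+1)$ are assumptions.

```python
import math


def closest_approach(n, l):
    """Inner turning point, in units of a_0, of a Coulomb orbit.

    Assumes energy -1/n^2 (Rydbergs) and squared angular momentum l*(l+1)
    (units of hbar^2). The turning points are the roots of
    rho^2 - 2*n^2*rho + n^2*l*(l+1) = 0; the smaller root is returned.
    """
    lam = l * (l + 1)
    return n * n - n * math.sqrt(n * n - lam)


if __name__ == "__main__":
    # As n grows the inner turning point decreases toward l*(l+1)/2.
    for l in (1, 2):
        print(l, [round(closest_approach(n, l), 4) for n in (3, 5, 10, 50)])
```

Under these assumptions the limiting distances for large $n$ are $l(l+1)/2$, that is $a_0$ for $l = 1$ and $3a_0$ for $l = 2$.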

56b. Derivation of the Ritz Formula. In order to place the empirical
formulas (56·1) and (56·2) on a theoretical basis we separate the variables
as in Sec. 28 and apply the B. W. K. method to the radial equation (28·19).

1 L. Pauling and S. A. Goudsmit, The Structure of Line Spectra, p. 38, New York,
1930.





Following an ingenious scheme due to Bohr¹ we shall base our work on a
comparison between the actual problem and the hydrogen-atom problem
for the same energy and quantum number.


Let $\Phi(r,l)$ denote the effective potential energy

$$\Phi(r,l) = V(r) + \frac{(l + \tfrac{1}{2})^2}{\kappa r^2}$$

for the valence electron of an alkali atom with azimuthal quantum
number $l$ [cf. Secs. 21h and 28, Eq. (28·23)]. $\kappa$ denotes the quantity
$8\pi^2\mu/h^2$ as in previous chapters. Let $\Phi_H(r,l)$ denote the corresponding
effective potential energy

$$\Phi_H(r,l) = -\frac{e^2}{r} + \frac{(l + \tfrac{1}{2})^2}{\kappa r^2}$$

for a hydrogen atom.


The expressions for the classical local radial momentum in the two cases
are then

$$p(r,l,E) = \sqrt{2\mu(E - \Phi)}, \qquad p_H(r,l,E) = \sqrt{2\mu(E - \Phi_H)}.$$

We neglect the slight difference in the mass coefficients $\mu$ appropriate to
the two different problems.

The condition for the location of the discrete energy levels of the
problem in hand is

$$J = \oint p(r,l,E)\,dr = (v + \tfrac{1}{2})h, \qquad v = 0, 1, 2, \ldots$$ (56·5)

The corresponding condition for the hydrogenic problem is

$$J_H = \oint p_H(r,l,E)\,dr = (v_H + \tfrac{1}{2})h, \qquad v_H = 0, 1, 2, \ldots$$ (56·6)

The latter condition is known to be equivalent to the familiar equation
(cf. Sec. 29a),

$$E = -\frac{Nhc}{(v_H + l + 1)^2},$$ (56·7)

whether $v_H$ is an integer or not. Let us now define a continuous function
$n^*(E)$ (here the asterisk does not denote the complex conjugate) by means
of the equation

$$E = -\frac{Nhc}{n^{*2}}.$$ (56·8)

For all values of $E$, whether quantized or not, Eqs. (56·6), (56·7), and
(56·8) lead to the relation

$$J_H(E,l) = \oint p_H(r,l,E)\,dr = (n^* - l - \tfrac{1}{2})h.$$ (56·9)
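The hydrogenic relation $J_H/h = n^* - l - \tfrac{1}{2}$ can be checked by direct numerical integration. The following sketch assumes energies in Rydbergs and lengths in units of $a_0$, in which case $J_H/h = (1/\pi)\int \sqrt{E - \Phi_H}\,d\rho$ between the turning points with $\Phi_H = -2/\rho + (l + \tfrac{1}{2})^2/\rho^2$; the function name, the units, and the quadrature rule are assumptions for illustration.

```python
import math


def hydrogen_phase_integral(n_star, l, steps=4000):
    """Return J_H/h for energy E = -1/n_star^2 (Rydberg units, length in a_0).

    The substitution rho = mid - half*cos(theta) removes the square-root
    behavior of the integrand at both turning points, so a plain midpoint
    rule in theta is accurate.
    """
    big_l = l + 0.5
    energy = -1.0 / n_star ** 2
    root = n_star * math.sqrt(n_star ** 2 - big_l ** 2)
    inner, outer = n_star ** 2 - root, n_star ** 2 + root  # turning points
    mid, half = 0.5 * (inner + outer), 0.5 * (outer - inner)
    d_theta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * d_theta
        rho = mid - half * math.cos(theta)
        radicand = energy + 2.0 / rho - big_l ** 2 / rho ** 2
        # Guard against tiny negative values from rounding near the ends.
        total += math.sqrt(max(radicand, 0.0)) * half * math.sin(theta) * d_theta
    return total / math.pi


if __name__ == "__main__":
    for n_star, l in ((2.12, 1), (3.0, 0), (4.5, 2)):
        print(n_star, l, hydrogen_phase_integral(n_star, l), n_star - l - 0.5)
```

The check works for non-integral $n^*$ as well, which is the property the argument relies on.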

Replacing $v$ in (56·5) by its value in terms of $l$ and the total quantum

1 N. Bohr, Proc. London Phys. Soc. 36, 296 (1923). Cf. also J. H. Van Vleck,
Quantum Principles and Line Spectra, pp. 110-114, Washington, 1926. The above
discussions are based on the Bohr theory. For a pure wave-mechanics treatment see
J. C. Slater, Phys. Rev. 31, 333 (1928).





number $n$ (i.e., $v + l + 1$), we obtain for the quantum condition fixing the
energy levels of the alkali-atom model

$$J(E,l) = \oint p(r,l,E)\,dr = (n - l - \tfrac{1}{2})h, \qquad n = l + 1,\ l + 2,\ \ldots$$ (56·10)

The next step in the argument is to form the difference $\Delta J$ between the
phase integrals $J$ and $J_H$ for the alkali model and the hydrogen atom,
using the same energy and azimuthal quantum number for each. In
this way we obtain

$$\Delta J(E,l) = \oint p(r,l,E)\,dr - \oint p_H(r,l,E)\,dr = (n - n^*)h.$$ (56·11)

$E$ becomes an approximate eigenvalue of the Hamiltonian of the alkali
model if $n$ is chosen to be any positive integer, and $l$ is given a positive
integral value or set equal to zero.

Whereas $J$ and $J_H$ become infinite as $E$ approaches zero, $\Delta J$ is a
slowly varying function of the energy throughout the range of energies
encountered in a spectrum series (this observation is the key to the
method). The contributions to the two integrals of the portions of the
orbits outside the alkali core may be very large if the energy is small
numerically, but they are equal in magnitude and cancel each other.
The contributions of the portions of the two orbits inside the core ($r < r_1$)
are nearly independent of the energy. This is especially true of the
contribution to $J$, which is much greater than the corresponding
contribution to $J_H$. These statements will be evident on examination of the
effective potential-energy curves for principal-series energy levels shown
in Fig. 24. In this figure a horizontal line indicates the lowest energy
level of the principal series of sodium, and gives a basis for the graphical
evaluation of $p(r,l,E)$ and $p_H(r,l,E)$.

We conclude that $\Delta J/h$, i.e., $n - n^*$, should be roughly constant and
accurately representable as a linear function of $E$ in the range of energies
included in the actual series in question. Treating $n - n^*$ as constant,
we get the Rydberg formula; writing

$$n - n^* = \frac{\Delta J}{h} = a_l - b_l E,$$ (56·12)

and substituting the resulting expression for $n^*$ into (56·8) we obtain a
formula of the Ritz type (56·2) with values of $a_l$ and $b_l$ which have
theoretical significance. Using the theoretical value of $n - n^*$, we can
choose $n_0$ so that the parameter $n$ of Eqs. (56·1) and (56·2) becomes
identical with the total quantum number $v + l + 1$.

The quantity $n - n^*$ is called the quantum defect since it defines the
difference between the actual energy and the ideal energy of the
corresponding state of the hydrogen atom. It is readily evaluated in first
approximation from any assumed potential-energy curve when the energy
of the state in question is known experimentally. A graphical evaluation
of $\Delta J$ will suffice for this purpose. From the approximate value of
$n - n^*$ obtained in this way and the experimental value of $n^*$ we obtain
a trial value of $n$ which should be approximately integral, if our
potential-energy curve is not too bad. We then identify the true total quantum
number of the state in question with the integer which lies closest to the
trial value. Thus in the case of the lowest energy level of the principal
series of sodium a graphical integration yields the value 1.08 for the
quantum defect. The empirical $n^*$ is 2.12 and the trial value of $n$ is
3.20. We conclude that the "true" value of the total quantum number $n$
for this energy level is 3. The rather large deviation of the trial value
of $n$ from the nearest integer indicates that we have chosen too large a
value for $r_1$.
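The assignment rule just described amounts to a few lines of arithmetic. The sketch below uses the sodium figures quoted above (quantum defect 1.08, empirical $n^* = 2.12$); the function names are hypothetical, and energies are assumed to be measured in units of $Nhc$ so that Eq. (56·8) reads $E = -1/n^{*2}$.

```python
import math


def effective_quantum_number(energy_in_nhc):
    """Invert E = -1/n*^2 (energy in units of Nhc) to obtain n*."""
    return math.sqrt(-1.0 / energy_in_nhc)


def assign_total_quantum_number(n_star, quantum_defect):
    """Return the trial value n* + defect and the nearest integer to it."""
    trial_n = n_star + quantum_defect
    return trial_n, round(trial_n)


if __name__ == "__main__":
    trial, n = assign_total_quantum_number(2.12, 1.08)
    print(trial, n)
```

The size of the gap between the trial value and the assigned integer serves as a rough check on the assumed potential-energy curve.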

The quantity $b_l$ is readily worked out from the equation

$$b_l = -\frac{1}{h}\,\frac{\partial(\Delta J)}{\partial E}.$$ (56·13)

Classically $\mu/p$ is equal to the reciprocal of the velocity. Hence $\oint \mu\,dr/p$
is a periodic time and $b_l h$ can be interpreted as the difference in the
classical radial-oscillation periods for the laws of force of the sodium
model and the hydrogen atom.


57. THE BOHR ASSIGNMENT OF ELECTRONIC QUANTUM NUMBERS

57a. The Quantum Numbers of the Valence Electrons in the Spectra
of the Alkalies and Alkaline Earths. In the absence of precise knowledge
regarding the potential function $V(r)$ we cannot expect an accurate
theoretical computation of the quantum defect. Even a rough estimate
is useful, however, as a guide in making a correct assignment of the true
total quantum number to each energy level. This is the more important
inasmuch as the minimum value of $n$ for each series is not equal to the
theoretical value $l + 1$. In the case worked out in Sec. 56b, for example,
$n_0$ turns out to be 3 when $l + 1 = 2$.

Some information regarding the $n_0$ values for different series can be
obtained from a direct examination of energy-level diagrams like that
shown in Fig. 23, thus avoiding the integrations of (56·11). We can
assume that the total quantum number of each of the non-penetrating
hydrogenic orbits is the same as that of the corresponding energy level
of hydrogen. Furthermore, it follows from (56·11) that $a_l$ must increase
as $l$ decreases. Hence the energy must increase and decrease with $l$ as we
move along a line connecting states of the same total quantum number in a
diagram in which energy and azimuthal quantum number are laid out as
orthogonal coordinates. In Fig. 23 a line of constant total quantum
number must have a negative slope. This fact and the assignment of
total quantum numbers for the hydrogenic orbits ($l \ge 2$) suffice for the
determination of minimum values of $n_0$ for each series. Thus in the
sodium case, $n_0$ must be at least 3 for both the principal and the sharp
series. A similar examination of the term system of lithium shows that
the total quantum number for the normal state of this element (the
lowest sharp-series state) must be at least 2. More generally, if we start
with hydrogen and go up the series of elements with a single valence
electron, we find that the total quantum number of the valence electron
in the normal state increases by at least unity as we pass from each
element to its successor. A systematic theoretical evaluation of the
quantum defects, such as that carried through by Van Urk,¹ shows
clearly that the actual increase in the total quantum number of the
valence electron in its normal minimum energy state, as we pass from
one alkali to the next in the series, must be exactly unity, and not a
larger integer.

When this increase in the minimum total quantum number with the 
atomic number was first discovered there was no theoretical principle to 
account for it. States of lower total quantum number which were 
expected theoretically in the heavier atoms were not found
experimentally. Their absence was a brute empirical fact which gave rise
to the first elementary form of the Pauli exclusion principle. 

Valuable supplementary evidence of an exclusion principle which
prevents the occurrence of expected states of low energy is obtained from
an examination of the spectra of the elements of the next two columns
of the periodic table. Consider first the group of elements with two
loose valence electrons consisting of helium and the alkaline-earth
metals, beryllium, magnesium, calcium, etc. (cf. Thomsen diagram of
periodic table on p. 489). The strong lines in the spectra of these
elements (to be specific we shall focus our attention on the case of
magnesium) come primarily from two systems of energy levels each
similar in general character to the sodium system but differing in fine
structure. For the present we ignore this fine structure except insofar as
we apply the conventional designations "singlet" and "triplet" to the
two systems under consideration. The same principles of selection apply
to transitions inside any one of these systems as in the case of sodium.
(Transitions from one system to the other are weak.) Consequently the
levels can be classified like those of sodium. We are at once driven to
interpret each system as due to the multiplicity of "excited" states of
one of the valence electrons moving in a central force field due to the
nucleus and a core composed of the remaining electrons.

It was at first difficult to understand how the core could have spherical
symmetry in this case. One would suppose that the inner core of a
magnesium atom obtained by removing both valence electrons would

1 A. Th. Van Urk, Zeits. f. Physik 18, 268 (1923). Cf. also M. Born, loc. cit.,
pp. 195-198; Pauling and Goudsmit, loc. cit., p. 40.




have the same spherical symmetry as the core of the sodium atom, each
having the same number of electrons as a neon atom. Adding one
valence electron in its lowest energy state should give a complete core
for a magnesium atom with the same symmetry as a complete sodium
atom. These conjectures are in fact verified by an examination of the
spectrum of ionized magnesium. But according to the Bohr theory
the normal state of the valence electron of sodium had one unit of angular
momentum and could not give rise to a symmetrical average force field.
Happily, this difficulty disappears when we adopt the point of view of
wave mechanics which assigns zero angular momentum and a spherically
symmetric wave function to the valence electron of sodium in its normal
state.

The existence of the two different systems of energy levels suggests
two alternative central force fields, a possibility we are not yet ready to
interpret. This puzzling feature of the situation does not prevent us,
however, from reading the primary lesson of the magnesium energy-level
diagram, which is that the normal state to which one of the valence
electrons returns after a period of excitation has the quantum numbers
$n = 3$, $l = 0$. Using the conventional spectroscopic notation, we
designate this state by the symbol 3s where the figure 3 specifies the value of $n$
and the letter s indicates that $l = 0$. As the normal state of the inner
valence electron is proved by the spark spectrum to be of the 3s type
also, we see that it must be possible to have two electrons in s states
of the same total quantum number.

Objection may be taken to the above statement on the ground that
when two electrons are added in succession to a doubly charged ion, 
the fact that the first goes into a 3s state when it is the only valence 
electron does not prove that it remains in a 3s state when a second valence 
electron is attached with comparable firmness. The question arises 
whether as the second electron passes from one state to another of lower 
energy (stronger binding) it does not perturb the inner electron more and 
more violently until in the end the 3s nomenclature with its implication 
of a plane orbit in a central force field is no longer applicable. In fact 
from the standpoint of the Bohr theory two coexisting 3s elliptic orbits 
in the same atom are very hard to imagine and would certainly violate 
the third Bohr postulate. 

Nevertheless, it does make sense to say that both the valence electrons 
of magnesium in its normal state occupy equivalent 3s states. Speaking 
more generally, it is possible to validate the practice of assigning to each 
of the inner electrons of any atom, as well as to the valence electrons, a
pair of quantum numbers $n, l$ appropriate to a central force field. In
order to justify these statements we shall turn aside from the main line of 
argument to consider this question of the assignment of quantum numbers 
to the individual electrons from the standpoint of wave mechanics. 





57b. Perturbation Theory and the Significance of an Assignment of
Quantum Numbers to Inner Electrons. In the chapter on perturbation
theory we have seen that two different eigenvalue-eigenfunction problems
A and B can frequently be related by means of a continuous series of
interpolation problems whose Hamiltonian $H(\lambda)$ degenerates into that
of A for one value of $\lambda$ and into that of B for another value. In this
case the interpolation eigenvalues $E_n(\lambda)$ and eigenfunctions $\psi_n(x,\lambda)$
form a continuous bridge between corresponding eigenvalues and
eigenfunctions of A and B. If the problem A has a higher symmetry and a
higher degree of degeneracy than B, the energy levels will split as $\lambda$
passes from the value appropriate to A, say zero, toward the value
appropriate to B, say unity. Hence each energy level of A will be
correlated with a multiplicity of energy levels of B. On the other hand,
unless B has also a special type of symmetry which is lacking in A
and in the interpolation problem, there will be no corresponding splitting
of energy levels as we pass from $\lambda = 1$ toward $\lambda = 0$. Thus, except for
accidental degeneracy, which we can ignore, we can say that each energy
level of the less symmetric problem B is correlated with a single definite
energy level of A. If the problem B has degenerate eigenvalues, though
less degenerate than those of A, it will be because there are two or more
independent observables which commute with both $H_B$ and $H(\lambda)$. $H(\lambda)$
must unite with one or more observables $\alpha_1, \alpha_2, \ldots$ to form a complete
set of independent commuting observables. Each simultaneous
eigenfunction of $H(\lambda)$ and the $\alpha$'s is nondegenerate and forms a perfectly definite
connection between corresponding eigenfunctions of the problems A
and B.
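The splitting of a degenerate level along the interpolation path can be seen in the smallest possible example. The sketch below is a toy model only: the two-by-two matrices standing for the Hamiltonians of A and B are invented for illustration and have nothing to do with a real atom, and the linear path $H(\lambda) = H_A + \lambda(H_B - H_A)$ is assumed.

```python
import math


def eigenvalues_2x2(a, b, d):
    """Eigenvalues, in ascending order, of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def interpolated_levels(lam, h_a=(1.0, 0.0, 1.0), h_b=(1.0, 0.5, 1.2)):
    """Levels of H(lam) = H_A + lam*(H_B - H_A).

    Each Hamiltonian is given as the triple (a, b, d) of a symmetric 2x2
    matrix. The default H_A is doubly degenerate; the default H_B is not.
    Both are hypothetical toy values.
    """
    a, b, d = (x + lam * (y - x) for x, y in zip(h_a, h_b))
    return eigenvalues_2x2(a, b, d)


if __name__ == "__main__":
    for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(lam, interpolated_levels(lam))
```

In this toy case the single degenerate level of A at $\lambda = 0$ connects continuously with two distinct levels of B at $\lambda = 1$, while each level of B traces back to one level of A.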

It will now be proved that if we identify the Hamiltonian of B with
that of our atomic model, we can identify A with a parallel problem in
which each electron moves in a central force field independent of the
others. The potential energy for a neutral atom with nuclear charge
$Ze$ is

$$V_B = -\sum_j \frac{Ze^2}{r_j} + \sum_{j,k} \frac{e^2}{r_{jk}},$$ (57·1)

where $r_j$ is the distance of the $j$th electron from the nucleus, $r_{jk}$ is the
distance between electrons $j$ and $k$, and the second sum is to be extended
over all pairs of values of $j$ and $k$ [cf. Eq. (32·1)]. Let the potential
energy of the problem A be

$$V_A = \sum_j V_0(r_j),$$ (57·2)

and let that of the interpolation problem be

$$V = V_A + \lambda(V_B - V_A).$$ (57·3)



In the limiting problem A there is no interaction between the electrons
and each moves independently of the others. Eigenfunctions of this
problem are obtained by separation of the variables as follows. Let

$$\psi^{(A)} = \prod_j \psi_j(\mathbf{r}_j),$$ (57·4)

where each $\psi_j$ is an eigenfunction of

$$\nabla_j^2 \psi_j + \kappa\,[E - V_0(r_j)]\,\psi_j = 0$$ (57·5)

with the eigenvalue $E_j$. Then $\psi^{(A)}$ is an eigenfunction of the central-field
Hamiltonian $H_A$. Thus $V$ yields the desired connection between
the problem of the atom under consideration and a related central-field
problem of the type encountered in Secs. 28 and 29.

It is, of course, legitimate to assign a definite set of quantum numbers
to each electron in a suitably chosen eigenfunction of problem A. For
example we may identify each of the $\psi_j$ with one of the solutions of
(57·5) obtained by separating the variables in spherical coordinates.
Thus we choose for $\psi_j$ the expression

[Equations (57·6) and (57·7): the separated one-electron solution in spherical coordinates, labeled by the quantum numbers $n_j$, $l_j$, $m_j$; the formulas are illegible in this copy.]

It will be convenient to simplify the discussion by eliminating the
continuous spectrum of both problems A and B through the introduction
of the boundary condition that $\psi$ shall vanish when any electron reaches
a sphere of very large radius $\rho$ drawn about the nucleus as a center.
The modification produced in the experimentally available discrete
eigenvalues and their eigenfunctions will be negligible. Under these
circumstances a complete set of eigenfunctions of problem A is obtained
by forming all possible products of the above type. The energy of each
individual solution is

$$E^{(A)} = \sum_j E_{n_j l_j}.$$ (57·8)

Accordingly we have two kinds of degeneracy in problem A, viz.,
degeneracy due to the fact that the quantum number $m_j$ associated with $n_j$
and $l_j$ can be chosen in $2l_j + 1$ different ways without affecting the
energy, and permutation degeneracy due to the fact that an exchange
of the quantum numbers associated with any pair of electrons does
not affect the resultant energy. Such an exchange of quantum numbers
is, of course, equivalent to a permutation of the coordinates of the
different electrons (cf. Sec. 40d).

Owing to the degeneracy of the eigenfunctions of problem A, it is
not true that every eigenfunction is factorable into a product of
one-electron central-field wave functions. On the contrary there is an
infinite variety of linear combinations of product functions possible and
such linear combinations are not product functions. Hence it is not
literally correct to say that any eigenfunction of A assigns to each electron
a private set of central-field quantum numbers. In fact one can readily
prove that just those eigenfunctions of A which form the zero-order
approximate eigenfunctions of B are not product functions and do not
give each electron a definite set of individual quantum numbers.

On the other hand, it follows from (57·8) that every eigenfunction of A
is a linear combination of product functions in which the same set of
$n, l$ values appears in each term. Let us take, for example, the very
simple case in which there are just two electrons, as in the neutral helium
atom. Consider the special problem A energy level

$$E^{(A)} = E_{1,0} + E_{2,1}.$$

A typical product eigenfunction would be

$$\psi^{(A)} = \psi_{1,0,0}(\mathbf{r}_1)\,\psi_{2,1,0}(\mathbf{r}_2).$$

The degeneracy with respect to $m_j$ gives two other independent
eigenfunctions, viz.,

$$\psi_{1,0,0}(\mathbf{r}_1)\,\psi_{2,1,1}(\mathbf{r}_2), \qquad \psi_{1,0,0}(\mathbf{r}_1)\,\psi_{2,1,-1}(\mathbf{r}_2).$$

Permutation degeneracy adds three more wave functions obtained by
interchanging the symbols $\mathbf{r}_1$ and $\mathbf{r}_2$ in the above expressions. In any
linear combination of these six product functions we have the same pair
of pairs of $n, l$ values, viz., 1,0 and 2,1. Thus every eigenfunction of
problem A for this level and every corresponding zero-order eigenfunction
of problem B can be regarded as a description of a mixture of states in each
of which one electron occupies a 1s state, while the other occupies a 2p state. Hence
it is reasonable to speak of this energy level as one having one 1s electron
and one 2p electron, even though one cannot say which is which. In the
general case of an atom with any number of electrons one can say that
the eigenfunctions of problem A and the zero-order eigenfunctions of
problem B are linear combinations of product functions in each of which
the same number of electrons is assigned to every pair of values of $n$ and $l$.
In the higher order approximations to problem B there is a mixing of
eigenfunctions of A for different unperturbed energy levels.
Nevertheless every energy level of B is correlated with a unique corresponding
level of A. Hence it is reasonable to ascribe to the levels of B
designations which specify the A levels to which they belong. Thus we can
speak of an atomic-energy level (B level) as characterized by a definite
set of pairs of individual electron quantum numbers $n, l$ when what we
mean is that the corresponding level of the modified problem A has
eigenfunctions involving this set of quantum numbers and no others.
(It is easy to see that no modification of problem A due to changing
$V_0(r)$ can alter the $n, l$ values of the A level correlated with any given
B level.) This nomenclature will be particularly appropriate if it is
possible to choose the potential function $V_0(r)$ so that problem A really
bears a close resemblance to problem B.
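The counting in the two-electron example above can be reproduced by enumeration. The following sketch lists every distinct product function belonging to a given set of $n, l$ pairs, ignoring the Pauli principle just as the discussion does at this stage; the function name and the tuple representation of a product function are assumptions made for illustration.

```python
from itertools import permutations, product


def product_functions(config):
    """List all distinct product functions for a list of (n, l) pairs.

    Each product function is a tuple whose i-th entry is the (n, l, m)
    triple assigned to electron i. Both kinds of degeneracy are included:
    the 2l + 1 choices of m for each orbital, and the permutations of the
    orbitals among the electrons. No exclusion principle is applied.
    """
    m_ranges = [range(-l, l + 1) for _, l in config]
    found = set()
    for m_values in product(*m_ranges):
        orbitals = [(n, l, m) for (n, l), m in zip(config, m_values)]
        for assignment in permutations(orbitals):
            found.add(assignment)
    return sorted(found)


if __name__ == "__main__":
    for psi in product_functions([(1, 0), (2, 1)]):
        print(psi)
```

For the 1s2p case this yields six product functions, and every one of them carries the same pair of $n, l$ pairs, which is all that the designation of the level requires.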

Actually it is possible in various ways to pick out forms for $V_0(r)$
which make the solution of problem A a useful, though crude, first
approximation to the solution of B. In fact theoretical studies of atoms
and atomic spectra are largely made by perturbation methods using a
problem of the A type as the starting point. Such an initial approximation
ignores the forces on any one electron due to the specific instantaneous
positions of its companions, and substitutes a central force field
which can be identified roughly with the average field due to the
remaining electrons acting on the one under consideration when the latter is
located at the given distance from the nucleus. The problem of
determining the best form for $V_0(r)$ is similar to the problem of determining
$V(r)$ in Sec. 56a. One can begin with an estimate of the average charge
distribution in successive spherical shells drawn about the nucleus and
obtained from the Fermi-Thomas statistical theory,¹ or from X-ray
scattering experiments. If this charge is thought of as uniformly
distributed over each shell, the corresponding electrostatic potential function
$\varphi(r)$ is readily computed. $\varphi(r)$ is the potential of $Z$ electrons in the
case of a neutral atom, whereas what we want is the potential of the
$Z - 1$ electrons with which the electron under consideration is
interacting. Hence $\varphi(r)$ can be multiplied by $(Z - 1)/Z$, and $V_0(r)$ identified
with the potential energy of one electron in the combined field of the
nuclear charge and of this reduced electronic charge distribution.

As there is an infinite variety of central-field problems correlated
with the same fundamental B problem, it is possible to make different
choices of $V_0(r)$ for different purposes. Suppose, for example, that
we desire to make computations of the relative energies and transition
probabilities of the excited states of a sodium atom responsible for the
optical series spectrum. In this case a natural and simple procedure is to
choose suitable, fixed, low quantum numbers for the 10 inner electrons
which belong to the "core" of the atom and give the eleventh electron
various pairs of quantum numbers representing states of energy equal to,
or higher than, that of its normal state. The relative energies of the
different states in question will then differ as a result of the different
energies of the outer valence electron only, the contributions of the inner
electrons being the
1 Cf. footnote 1, p. 477. 





same in all cases. The wave functions specified do not take into account
permutation degeneracy and the Pauli principle. Nevertheless, in the
interests of simplicity we here assume our right to compute transition
probabilities from them. The assumption will be justified in Sec. 64b.
The components of the electric-moment matrix worked out in this way
depend only on the wave functions of the valence electron and are
independent of the factors in the total wave function representing the
inner electrons. To prove this statement we note that each matrix
element of, say, the x component of the electric moment, is the sum of
terms each involving the x coordinate of one electron. In a transition
of the type under consideration in which only one electron jumps, the
orthogonality of the two wave functions of the jumping electron eliminates
all terms except the one involving the x coordinate of that
particular electron. This term reduces at once to the matrix element
computed from the wave functions of the valence electron alone. Thus
for the special purpose under consideration it is unnecessary to introduce
either the energies or the wave functions of the core electrons and we
can choose V0(r) solely with the purpose of getting a correct representation
of the behavior of the valence electron.
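
The orthogonality argument can be checked numerically in a toy setting. The sketch below is an illustration only and not part of the original treatment: it assumes hypothetical one-dimensional "orbitals" (discrete sine functions on a grid) standing in for the core and valence wave functions, uses simple product functions with permutation degeneracy ignored, as in the text, and compares the matrix element of x1 + x2 with the one computed from the valence functions alone. The grid size and all names are arbitrary choices.

```python
import math

N = 40  # number of grid points (arbitrary illustrative choice)
xs = [j / (N + 1) for j in range(1, N + 1)]

def orbital(k):
    """Normalized discrete sine function; different k are orthogonal on this grid."""
    v = [math.sin(k * math.pi * x) for x in xs]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

core = orbital(1)    # stands in for the core wave function (hypothetical)
val_i = orbital(2)   # initial state of the valence electron (hypothetical)
val_f = orbital(3)   # final state of the valence electron (hypothetical)

# Matrix element of x1 + x2 between the product functions
# core(x1)*val_f(x2) and core(x1)*val_i(x2), summed over the whole grid.
full = sum(
    core[a] * val_f[b] * (xs[a] + xs[b]) * core[a] * val_i[b]
    for a in range(N)
    for b in range(N)
)

# Matrix element computed from the valence wave functions alone.
valence_only = sum(f * x * i for f, x, i in zip(val_f, xs, val_i))

print(full, valence_only)
```

The term carrying the core coordinate is multiplied by the overlap of the two valence functions, which vanishes, so the two printed numbers agree to rounding error.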

The general type A problem for the sodium atom reduces in the above
case to the single-electron problem of Sec. 56b. The latter can also be
regarded as the result of attempting to secure an approximate solution
of the primary type B problem of the sodium atom in the form of a
product of a wave function for the valence electron and a wave function
for the 10 core electrons. Its success is to be attributed largely to the
lack of serious overlapping of these two functions.

We conclude that the quantum numbers deduced for the normal state
of the valence electron from the empirical series spectrum and the theory
of Sec. 56 are, in fact, the same as the quantum numbers to be assigned
to this electron on the basis of the general A problem scheme just developed.
Let us now suppose that an atom of atomic number Z is transformed
into an atom of atomic number Z + 1 by first increasing its
nuclear charge gradually by |e| and then adding an additional electron.
The question arises whether, or not, the quantum numbers of the inner
group of Z electrons in the new atom are necessarily the same as those
of the group of Z electrons in the initial atom. In order to give a partial
answer we note that the increase in the nuclear charge, which constitutes
the first stage of the transformation, must inevitably transform each
energy level of the initial atom into a corresponding level of the positive
ion of the final atom having the same quantum numbers. The normal
state of any atom or ion being the state of lowest energy, it is clear that
the normal state of the atom Z will carry over into the normal state
of the ion (Z + 1)+ if the energetic order of the lowest levels is not
altered by the increase in nuclear charge. Such a change in the energetic



Sec. 57 ] THE BOHR ASSIGNMENT OF QUANTUM NUMBERS 


489 


order of the lowest levels should be a rare phenomenon, although it is
known to occur at certain points in the periodic table. It is made 
evident, when it does occur, by a comparison of the spectra of the atom 
Z and the ion (Z + 1)+. A shift in the quantum numbers of one or 
more of the inner electrons accompanying the binding of an additional 
outer electron is also not to be ruled out on purely a priori grounds. We 
can say, however, that such a shift should be unusual and must involve a 



Fig. 25. The Thomsen-Bohr diagram of the periodic table. Short-life radioactive
elements of atomic number greater than 92 have been discovered recently by Fermi and
collaborators.


multiple electron jump in the binding process which would be readily 
recognizable from an analysis of the spectrum produced by that process. 
Actually no such shifts are known experimentally. It is accordingly 
legitimate in general to use the optical spectra of the successive elements 
starting with hydrogen as a means of deducing the quantum numbers 
of the inner electrons of the heavier elements. Errors which might 
otherwise creep in by this process may be eliminated by checking the 
first spark spectrum of each element against the normal arc spectrum 
of the element which precedes it in the periodic table. 

Having justified the general procedure begun on pp. 482 to 483, we 
proceed with our discussion of the empirical evidence, turning next to the 
trivalent earth metals and to aluminum, in particular, since it follows







sodium and magnesium in the periodic table. Again we find two systems
of energy levels, "doublets" and "quadruplets" this time. The old
classification of the stationary states is applicable, but the normal state
of the valence electron for aluminum is not 3s but 3p. Since the 3s
level should have the lower energy if it existed (owing to greater penetration
into the core), we are driven to the conclusion that we have to do
with a new exclusion rule. An examination of the spectrum of Al+
shows that the core of normal Al is similar to the normal state of Mg
and contains two 3s electrons. The spectra of the other earth metals
are similar and point to the general rule that there can never be more than
two electrons in s states of a given atom having the same total quantum
number. Mention should be made in this connection of the transition
from helium, with two 1s electrons in the normal state, to Li with two
1s electrons and a 2s electron. Since there are no 1p states, the operation
of the above rule in this case throws the valence electron of Li into a
state of higher total quantum number than the inner electrons.

The spectra of the more electronegative elements which belong in the
lower portions of the columns in the Thomsen diagram are much more
complicated and for the present we omit all discussion of their analysis.
Fortunately this analysis is unnecessary. There is a gradual, though
somewhat irregular, increase in the ionization potential of the atoms
from Z = 5 to Z = 10 and from Z = 13 to Z = 18, followed in each
case by an abrupt fall from maxima at the inert gases neon and argon
to minima at the succeeding alkali metals. Now, an increase of ionization
potential should accompany the addition of more electrons to a
group having a common total quantum number, for the increase in
nuclear charge would not be wholly compensated by the additional
repulsive potential of the group. On the other hand, a drop in ionization
potential may be expected to occur when the normal state of the valence
electron starts a group of higher total quantum number than any of the
core electrons. Thus we are led to assume that the electrons added in
transforming beryllium into neon go into 2p states, while those added
in transforming magnesium into argon go into 3p states. 3d electrons
are excluded from the latter group because the 3p electrons, on account
of their greater penetration, are much more tightly bound.

From the fact that the inert gas neon is followed by the alkali sodium 
with a sudden drop in ionization potential we infer that the building up of 
the group of 2p electrons stops when the membership of that group reaches 
the value 6. Similarly the occurrence of the alkali potassium after the 
inert gas argon shows that the maximum number of 3p electrons is 6. 

Without carrying farther this discussion of the empirical evidence for 
the formation of successive groups of electrons with higher and higher
quantum numbers we can proceed to a provisional formulation of the 
Pauli principle. Our analysis shows that it takes two electrons to 





complete a group of s electrons of given total quantum number, but six
electrons for a group of p electrons of given total quantum number.
This difference between the sizes of the complete s groups and the
complete p groups becomes understandable when we remember that the
p states are degenerate and include three substates each, corresponding
to the values 0, ±1 for the magnetic quantum number m. An obvious
and satisfactory hypothesis is to assume that there can be two, but only two,
electrons in an atom having any given set of single-electron quantum numbers
n, l, m. This statement constitutes an elementary form of the Pauli
exclusion principle.¹ It excludes the crowding of an indefinitely large
number of electrons into any individual electron level in the limiting
case of problem A and excludes all the energy levels of problem B correlated
with excluded or overcrowded states of A. The principle has
been substantiated by a mass of supporting evidence which would be
out of place in this book. We refer the reader to the literature for
further information.
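
The counting behind this provisional rule is easily written out. The following minimal sketch assumes only what the text states, namely that m runs over the integers from -l to +l and that at most two electrons may share one set (n, l, m); the function names are illustrative, and the d entry is simply what the same rule would give if applied to l = 2.

```python
def m_values(l):
    """Magnetic quantum numbers m = -l, ..., +l for a given l."""
    return list(range(-l, l + 1))

def group_capacity(l, per_state=2):
    """Largest number of electrons in a group of given n and l when
    `per_state` electrons are allowed for each set (n, l, m)."""
    return per_state * len(m_values(l))

for label, l in [("s", 0), ("p", 1), ("d", 2)]:
    print(label, m_values(l), group_capacity(l))
```

The s and p capacities returned are 2 and 6, the group sizes inferred above from the ionization potentials.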

In the form given above, the principle assumes our right to assign
individual m values to the different electrons as we have assigned individual
n and l values. Such an assignment cannot be justified by our
previous argument. Nevertheless we can harmonize the rule with the
wave-mechanics point of view if we modify its statement to read, All
energy levels of an atom are excluded whose zero-order approximations
(problem A) involve products like (57·6) which assign more than two electrons
to the same set of quantum numbers n, l, m.

58. THE ELECTRON-SPIN HYPOTHESIS

58a. The Empirical Fine Structure of Spectrum Lines. In its
classical form the hypothesis of electron spin assumes that electrons are
spherical shells of electricity in rotation about their own centers and
consequently possess an angular momentum about this center and an
intrinsic magnetic moment. The hypothesis was independently proposed
by several different authors for various reasons, but first took
definitive form and attracted wide attention when Uhlenbeck and
Goudsmit² showed that it could be used to overcome the difficulties
inherent in Landé's magnetic-core theory of the fine structure of spectrum
lines.³ In adapting the spin postulate to the principles of quantum

1 W. Pauli, Jr., Zeits. f. Physik 31, 765 (1925); Pauling and Goudsmit, loc. cit.,
Chap. IX.

2 G. E. Uhlenbeck and S. Goudsmit, Die Naturwissenschaften 13, 953 (1925);
Nature 117, 264 (1926). The hypothesis was independently proposed by F. R.
Bichowsky and H. C. Urey, Proc. Nat. Acad. Sci. 12, 80 (1926). Earlier proponents
of the hypothesis include A. H. Compton, Wash. Acad. Sci. J. 8, 1 (1918), and E. H.
Kennard, Phys. Rev. 19, 420 (1922).

3 A. Landé, Verh. d. deut. physik. Gesell. 21, 685 (1919); W. Heisenberg, Zeits. f.
Physik 8, 273 (1922).




mechanics the model has been replaced by a system of equations which
bear a close relation to the model but which cannot be derived from it
in any rigorous sense. Heisenberg and Jordan¹ first treated the spin
properties of the electron by the matrix method and later Pauli² gave a
provisional formulation in terms of wave mechanics. A very great
advance was made when Dirac³ showed that a more rigorous wave-mechanical
treatment can be derived as a by-product of an attempt to
reconcile the quantum theory with special relativity theory. Dirac's
method of approach dispenses entirely with all assistance from the
classical model of Uhlenbeck and Goudsmit and is now generally conceded
to be the most fundamental way to introduce the subject. It is
relatively abstract, however, and leads ultimately to great difficulties
which have not been fully overcome as yet.⁴ Since the Pauli spin theory
is easier to apply than Dirac's, and yet sufficiently accurate for most
purposes, it seems wise to give this elementary formulation here, leading
up to it by the historical-empirical approach. We begin with a brief
description of the fine structure of optical energy-level systems.⁵

The optical energy-level spectra of the alkaline-earth metals show an
empirical structure similar to that which we have found in the case of
the alkalies, but more complicated, inasmuch as there are two sharp
series, two principal series, etc. In fact we find two complete arrays of
energy levels, each of which, fine structure excepted, is a replica of the
energy-level array of one of the alkalies. Transitions between one
major level, i.e., one group of fine-structure components, and another
of the same array are governed by the selection rule Δl = ±1. Transitions
between levels of the different arrays occur much less freely than
those between levels of the same array, especially in the spectra of the
lighter elements of this type. One of these arrays (singlet) shows no
fine structure, but, outside the sharp series, whose members are single,
the levels of the other array (triplet) have three closely spaced components
each.

All energy levels which can be given a place in such an array can be
accounted for in first approximation by means of a single-electron model. 
Not all major energy levels of the alkaline-earth elements, however, can 

1 W. Heisenberg and P. Jordan, Zeits. f. Physik 37, 263 (1926).

2 W. Pauli, Jr., Zeits. f. Physik 43, 601 (1927).

3 P. A. M. Dirac, Proc. Roy. Soc. A117, 610, A118, 351 (1928). Cf. also Dirac,
P.Q.M., Chap. 13; C. G. Darwin, Proc. Roy. Soc. A118, 654 (1928).

4 Cf. W. Pauli, in Geiger and Scheel's Handbuch der Physik, 2d ed., vol. 24, Part I,
pp. 242-247 (1933); also W. H. Furry and J. R. Oppenheimer, Phys. Rev. 46, 245
(1933).

5 Here we refer to the coarser type of fine structure which originates in the spin
of the extranuclear electrons. There is an additional so-called "hyperfine structure"
due to interaction between the outer electrons and the spin of the nucleus concerning
which we shall have nothing to say.





be fitted into a scheme of classification based on assignments of a quantum
number l which conforms to the principle of selection Δl = ±1. Moreover,
although structure of this type is to be found in the energy-level
spectra of elements with more than two valence electrons, its incompleteness
becomes more evident as the valence increases. On the other hand
it is possible to include many additional energy levels in a classification
of the major levels which assigns to each an l value in such fashion as to
preserve the selection principle Δl = 0, ±1 and to divide the levels of
most atoms into systems analogous to the singlet and triplet arrays
described above and each having its own characteristic fine structure.

The argument of Sec. 40c indicates that the square of the resultant
angular momentum ℒ² should be an integral of the motion of a free
atomic system composed of point electrons and a point nucleus. Furthermore,
ℒ² should be a function of H so that each energy level is characterized
by a definite value of the angular-momentum quantum number
L. According to the Bohr theory radiative transitions between states
of a many-particle problem with different angular momenta should be
governed by the principle of selection ΔL = 0, ±1. The same rule
will be derived on a quantum-mechanical basis in Sec. 64. Since this
selection rule is broader in scope than the rule for one-electron systems,
we conclude that the empirical quantum number which changes by 0, ±1
is to be identified with the quantum number L of the resultant angular
momentum of all the electrons. From this point on we reserve the
letters s, p, d, f, ..., introduced on p. 475 for the designation of the
states of a single electron moving in a central force field, and introduce
capital letters for states of a complete atom having different angular
momenta. The correlation between the letters and the L values is
indicated in Table I, p. 494.

All the major energy levels with a given value of L and belonging to a
common system have the same number of fine-structure components.
Within any system the number of such components increases in steps of
one or two as we pass from one value of L to the next larger one until a
maximum value is reached, after which the number of sublevels remains
constant. This maximum multiplicity is the fundamental characteristic
of the system, which is called "singlet," "doublet," "triplet,"
"quadruplet," etc., according as its value is 1, 2, 3, 4, ... . Table I
shows in detail the number of components for the major levels of all
types in systems with multiplicity up to 4.

Space does not permit a discussion of the methods by which empirical 
spectroscopists have been able to unravel these structures from the maze 
of lines which make up the spectrum from which they start. Suffice it to 
note here that the relative weakness of intersystem combination lines 
and the characteristic Zeeman patterns of different types of line are 
important aids. 





Table I. Multiplicities of Different Classes of Energy Levels

Symbol               S    P    D    F    G    H
L                    0    1    2    3    4    5
Singlet system       1    1    1    1    1    1
Doublet system       1    2    2    2    2    2
Triplet system       1    3    3    3    3    3
Quadruplet system    1    3    4    4    4    4


It will be observed that the number of components is always equal to
2L + 1 unless that number exceeds the multiplicity of the system. This
rule is confirmed by the study of systems of still higher multiplicity.
The number 2L + 1 is equal to the number of eigenvalues of ℒ_z consistent
with the eigenvalue L(L + 1)h²/4π² for ℒ², i.e., to the number of sublevels
into which a major level of angular-momentum quantum number L
should be split by a magnetic field according to the elementary spin-free
theory of the Zeeman effect given in Sec. 49c. In all cases the spacing
of the fine-structure components decreases rapidly with increasing values
of L and of the ordinal number in a Rydberg series.

The fine structure of the energy-level system is, of course, derived
from the corresponding fine structure of the spectrum lines. The latter
also reveals the existence of a new principle of selection affecting transitions
between different sublevels of any pair of major levels. This rule
takes the simple form ΔJ = 0, ±1 if we assign appropriate values of a
new quantum number J, called the inner quantum number. A suitable
scheme for the assignment of J values, which gives the correct number of
components for each major level, is contained in the following rule.

Let 2S + 1 denote the multiplicity of the system to which the given
major level belongs. The maximum value of J for any of the components is
L + S and the minimum value |L − S|. The remaining values are spaced
at equal unit intervals between these extremes. Consider, for example, the
case of a P level in the quadruplet system. Here S = 3/2 and L = 1.
The J values are 2½, 1½, ½. A D level of the same system will have
four components for which J takes on the values 3½, 2½, 1½, ½,
respectively. The so-called "normal" order of the energy values is
that of increasing J and is the usual order for the simple spectra of
the electropositive elements. It is more or less systematically inverted
in the case of the electronegative elements.
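
The rule for the normal assignment can be tabulated mechanically. The sketch below is only an illustration (the function name is an assumption): it lists the J values from L + S down to |L − S| in unit steps, using exact fractions so that half-integral values are represented without rounding.

```python
from fractions import Fraction

def j_values(L, S):
    """J values of the normal assignment: L + S, L + S - 1, ..., |L - S|."""
    L, S = Fraction(L), Fraction(S)
    j = L + S
    out = []
    while j >= abs(L - S):
        out.append(j)
        j -= 1
    return out

quartet_P = j_values(1, Fraction(3, 2))  # P level, quadruplet system
quartet_D = j_values(2, Fraction(3, 2))  # D level, quadruplet system
print([str(j) for j in quartet_P])
print([str(j) for j in quartet_D])
```

The number of J values obtained in this way is the smaller of 2L + 1 and 2S + 1, which is the pattern of component counts shown in Table I.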

It is evident at once that the above normal assignment of J values is 
arbitrary in that we could increase all the values in the spectrum of any 
element by any amount Jo without affecting the form of the selection 
principle. According to the normal assignment the J values are integral 
in the case of systems of odd multiplicity and half-integral in the case 
of systems having even multiplicity. It is a fundamental and very







important empirical rule that systems of even multiplicity only occur
in atoms or ions having an odd number of extranuclear electrons, while
systems of odd multiplicity only occur in atoms or ions with an even number
of extranuclear electrons. Hence it would be possible to eliminate the
half-integral J values entirely by introducing a Jo which has a half-integral
value for atoms and ions with an odd number of extranuclear
electrons, and an integral value for those which have an even number of
extranuclear electrons. Such an arbitrary procedure would partially
spoil the symmetry of the normal assignment of J values, however, and
does not lend itself to the development of the theory.

58b. The Combination of Angular Momenta.¹ The existence of any
fine structure in the spectra of the alkali atoms means that the single-electron
model is imperfect, and that, if it is to be retained at all, the
central-force-field hypothesis must be modified. Classically such a
modification would imply that the optical electron is subject to a torque
with respect to the nucleus as a center. In the case of a free atom such a
torque would imply a countertorque and hence a second angular momentum
in the atom. The natural inference would be that the second angular
momentum comes from the core electrons. Of course the coupling
between the optical electron and the core electrons would have to be
weak to account for the fact that the single-electron model works so
well. Since the principle of selection for the inner quantum number J
is the same as that to be expected for the quantum number of the square
of the resultant angular momentum, we are led to identify J with that
quantum number and l, or L, with the quantum number of the square
of the angular momentum of the optical electron, or the optical group
of electrons, as the case may be. In the case of S states the angular
momentum of the optical electron, or electrons, is, on this hypothesis,
zero. According to the normal assignment the corresponding value of J
is the quantum number S which fixes the multiplicity of the system. We
are thus led to identify the square of the core angular momentum with
S(S + 1)h²/4π².

Let us now examine the number of fine-structure components of the
various major energy levels predicted by this provisional theory. To
this end we consider the coupling of two sets of electrons revolving about a
common nucleus. To avoid prejudice and confusion we introduce new
symbols for the quantum numbers in this calculation. Let L1 and L2
denote the angular-momentum quantum numbers for the two individual
sets, while L denotes the quantum number of the resultant angular
momentum. The corresponding magnetic quantum numbers will be
indicated by M1, M2, and M, respectively. We refer to the uncoupled
state of the system, in which the interaction terms of the Hamiltonian
are neglected or replaced by centrally symmetric averages, as case A,

1 The method here used is due to Slater [Phys. Rev. 34, 1293 (1929)].





and to the coupled state, as case B. For case A one can use a system
of wave functions which quantize the square of the total angular momentum
of each group and the z component of the angular momentum of each
group. The matrices of ℒ_1², ℒ_2², ℒ_1z, and ℒ_2z are then diagonal. We
can also use a system of wave functions which quantizes the square of the
resultant angular momentum of the combination of the two groups and
the z component of that momentum. The latter system of wave functions
is adapted to the description of the coupled state, case B, in which
ℒ_1z and ℒ_2z are no longer integrals of the motion, but the former
description is the one which gives meaning to the assignment of separate
quantum numbers to the two groups.

Let us now consider a case A energy level with the individual quantum
numbers L1 and L2. Ignoring permutation degeneracy (this procedure
will be justified later on), we note that the level in question has

(2L1 + 1)(2L2 + 1)

linearly independent wave functions obtained by multiplying the wave
function of the first group for each value of M1 by the wave function of
the second group for each value of M2. A suitable canonical transformation
will carry these wave functions over into a new set, equal in
number, which are eigenfunctions of the angular momenta ℒ² and ℒ_z of
the combination. As a matter of fact each of the product functions is
already an eigenfunction of ℒ_z with the eigenvalue M = M1 + M2.
Hence the canonical transformation will form linear combinations of
those initial functions only which have a common value of M1 + M2.
We are thus led to construct a diagram of the type shown in Table II
which illustrates the special case L1 = 2; L2 = 1. The M1 values are
Table II. Values of M1 + M2 for the Case Where L1 = 2; L2 = 1.

                          M1
              -2    -1     0    +1    +2
       -1     -3    -2    -1     0    +1
M2      0     -2    -1     0    +1    +2
       +1     -1     0    +1    +2    +3


laid off horizontally, the M2 values vertically, each square indicating a
possible product function for case A. In these squares we have inserted
the corresponding values of M. In this particular case we shall have one
product function for M = ±3; two for M = ±2; three for M = 0, ±1.
In passing from the case A product functions to the zero-order case B
functions we have to "scramble" the different product functions belonging
to each M value. As a result we obtain the same number of case B




functions for each M value that we had before. Thus each square
of Table II may be considered as representing a case B function with the
given value of M.

Knowing the number of substates for each M value in case B we can
readily determine the resulting eigenvalues of ℒ², or the values of the
quantum number L. Since 3 is the largest value of M in the group
represented by the table, it follows from the discussion in Sec. 40f
that the states for which M = ±3 must have the quantum number
L = 3. But we know from this same discussion that if an energy
level of a free atomic system has any eigenfunctions which are also
eigenfunctions of ℒ² with the eigenvalue L(L + 1)h²/4π², it must have 2L + 1
such functions with the "magnetic" quantum numbers M = 0, ±1,
±2, ..., ±L. The construction of such an L complex will use up all
the case B functions on the top row and right-hand column of the array
shown in Table II. There will then be just one remaining case B function
for each of the M values, ±2. Consequently we must have an L complex
for L = 2 as well as L = 3. This will use the remainder of the second
row of the figure and one function from the third row. Finally the remaining
functions in the last row are just sufficient to form an L complex
with the eigenvalue L = 1.

In general it will be evident that the number of L complexes which
are formed by a canonical transformation of the (2L1 + 1)(2L2 + 1)
product functions of any case A energy level is equal to the number
of rows in a diagram like Table II arranged with the long way of the
diagram horizontal. In other words it is equal to the smaller of the two
quantities (2L1 + 1), (2L2 + 1).
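
The counting procedure just described lends itself to a short sketch. The code below is an illustration under the assumption of integral L1 and L2, with illustrative function names: it builds the array of M = M1 + M2 values as in Table II and then repeatedly removes a complete L complex, one function for each M from -L to +L, starting from the largest M still present.

```python
from collections import Counter

def m_table(L1, L2):
    """Rows indexed by M2, columns by M1; each entry is M = M1 + M2."""
    return [[m1 + m2 for m1 in range(-L1, L1 + 1)] for m2 in range(-L2, L2 + 1)]

def l_complexes(L1, L2):
    """Peel off L complexes, starting each time from the largest remaining M."""
    counts = Counter(m for row in m_table(L1, L2) for m in row)
    found = []
    while sum(counts.values()) > 0:
        L = max(m for m, c in counts.items() if c > 0)
        found.append(L)
        for m in range(-L, L + 1):  # one function for each M = -L, ..., +L
            counts[m] -= 1
    return found

print(m_table(2, 1))
print(l_complexes(2, 1))
```

For L1 = 2, L2 = 1 the procedure yields the complexes L = 3, 2, 1, and in general the length of the returned list is the smaller of 2L1 + 1 and 2L2 + 1.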

The application of the perturbing Hamiltonian which transforms the
case A problem into the case B problem cannot break up any of the
L complexes since, by the theory of Sec. 40f, they are indissoluble so
long as there is no external torque applied to the system as a whole.
On the other hand, since this perturbation brings in Coulomb forces
between individual electrons of the two different sets, it must eliminate
the degeneracy with respect to ℒ_1z and ℒ_2z, thus producing energy differences
between the different L complexes. From the standpoint of
Sec. 40c this splitting of the unperturbed energy levels is to be attributed
to the fact that the perturbing term of the Hamiltonian does not commute
with the angular-momentum vectors of the two parts of the system.
Hence there remain no operators which commute with H which do not
also commute with ℒ². This makes ℒ² a function of H (except for
eventual accidental and continuous-spectrum degeneracy) and allows
only one value of L for any one energy level. We conclude that a case A
energy level characterized by the quantum numbers L1, L2 should be
split by the interaction into 2L1 + 1 components, where L1 is taken to be
the smaller of the two numbers L1, L2.





A less rigorous, but perhaps more appealing, argument for this
splitting is the following. Since each of the zero-order case B wave
functions is a simultaneous eigenfunction of ℒ², ℒ_1², ℒ_2², the relation

$$\mathcal{L}^2 = \mathcal{L}_1^2 + \mathcal{L}_2^2 + 2\,\vec{\mathcal{L}}_1 \cdot \vec{\mathcal{L}}_2 \qquad (58\cdot 1)$$

demands that two such eigenfunctions belonging to different L complexes
must be eigenfunctions of

$$\vec{\mathcal{L}}_1 \cdot \vec{\mathcal{L}}_2 = \mathcal{L}_{1x}\mathcal{L}_{2x} + \mathcal{L}_{1y}\mathcal{L}_{2y} + \mathcal{L}_{1z}\mathcal{L}_{2z}$$

with different eigenvalues. Classically the energy could not be independent
of ℒ_1 · ℒ_2 unless the mutual energy of the two groups of particles
was independent of their orientation. But if the two groups have
nonvanishing angular momenta they cannot have spherical symmetry,
and their mutual energy cannot be independent of their orientation.
Hence we infer that the average energy of zero-order case B eigenfunctions
belonging to different L complexes must be different.
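
The eigenvalues of the scalar product implied by the relation (58·1) can be tabulated directly. The sketch below assumes integral quantum numbers and eigenvalues of the form L(L + 1)h²/4π² for each squared angular momentum, as in Sec. 40f; the function name is illustrative, and results are expressed in units of h²/4π².

```python
from fractions import Fraction

def l1_dot_l2(L, L1, L2):
    """Eigenvalue of the scalar product of the two angular momenta, in units
    of h^2/4pi^2, found by solving (58.1) for that product."""
    return Fraction(L * (L + 1) - L1 * (L1 + 1) - L2 * (L2 + 1), 2)

for L in (3, 2, 1):
    print(L, l1_dot_l2(L, 2, 1))
```

For L1 = 2, L2 = 1 the three complexes L = 3, 2, 1 give the distinct values 2, -1, -3, which illustrates why complexes with different L must belong to different eigenvalues of the scalar product.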

58c. The Landé Magnetic Core Theory. Turning back to the
problem of the actual fine structure of spectrum lines, we are led to
correlate the quantum numbers L, L1, L2 of the preceding theory with
the empirical quantum numbers J, L, S, respectively, of Sec. 58a. In
other words we postulate that the atom can be resolved into two weakly
coupled parts with individual angular-momentum vectors ℒ and 𝒮,
whose resultant ℒ + 𝒮 we shall call 𝒥. In applying the term "angular
momentum" to these vectors we imply that they are subject to the
commutation rules (38·15), (38·16), and (38·17). It follows from Sec. 40f
that the associated eigenvalues are given by expressions of the form

$$
\begin{aligned}
(\mathcal{L}^2)' &= L(L+1)\frac{h^2}{4\pi^2}, & (\mathcal{L}_z)' &= M_L\frac{h}{2\pi},\\
(\mathcal{S}^2)' &= S(S+1)\frac{h^2}{4\pi^2}, & (\mathcal{S}_z)' &= M_S\frac{h}{2\pi},\\
(\mathcal{J}^2)' &= J(J+1)\frac{h^2}{4\pi^2}, & (\mathcal{J}_z)' &= (\mathcal{L}_z + \mathcal{S}_z)' = M\frac{h}{2\pi}.
\end{aligned}
\qquad (58\cdot 2)
$$


The interpretation of ℒ and 𝒮 as ordinary orbital angular momenta
defined by (34·10) requires that the quantum numbers L, M_L, S, M_S, J,
M shall all be integers, but the general theory of Sec. 40f admits the
possibility of odd multiples of ½. According to the above hypothesis
the number of sublevels due to a weak coupling of these vectors should




be 2L + 1 if L ≤ S and 2S + 1 if S ≤ L, a result in agreement with
observation.¹

Interpreting ℒ as the angular momentum of the exterior (valence)
electron or electrons active in producing the optical spectrum, we are
confronted with the problem of finding a physical interpretation for 𝒮.
The first and obvious suggestion that 𝒮 is the orbital angular momentum
of the core electrons came from Landé. He observed that the existence
of a core angular momentum implied the existence of a corresponding
core magnetic moment whose field would apply the necessary torque to
the valence electron. In first approximation the interaction energy
of the two angular momenta was taken to be the negative scalar product
of the magnetic moment of the core and the average magnetic field due
to the valence electron at the nucleus. The absolute value of the width
of the alkali doublets calculated on this basis is of the right order of
magnitude, but the Landé theory involved several serious difficulties of
which three will be mentioned here.

One of these difficulties was the inversion of the energetic order of
the fine-structure levels in normal multiplets. According to the Landé
theory a parallel orientation of the two angular momenta should give
the least energy and, of course, the largest J value. The opposite
energetic order is the normal one, empirically. A second source of
embarrassment lay in the occurrence of half-integral values of S and J
for elements of odd atomic number, an empirical feature equally incomprehensible
from the standpoint of the Bohr theory and of the wave
mechanics, so long as one uses the Landé model. The third and crucial
difficulty lay in the existence of a discrepancy between the angular
momentum of the normal state of each atom and that of the core of the
succeeding element of the periodic table. The inert gases are all diamagnetic
and we should accordingly expect the normal state of each of
them to have zero magnetic moment and zero angular momentum.
Thus the J value of the normal state of neon and the S value of the sodium
atom should both be zero. Actually we find that the latter is ½. Again,
the J value of the normal state of sodium being ½, we should expect the
S value of magnesium to be the same. Actually there are two systems of
series in the magnesium energy-level scheme, one singlet (S = 0), the
¹ It is worthy of note that the same result is obtained if one asks for the number
of different orientations of a vector of length L with respect to a vector of length S
which make the length of their resultant an integer or a half integer, according as
the sum L + S is assumed to be an integer or a half integer. In the Bohr theory L
and S were interpreted directly as the measures of the absolute values of the
corresponding angular momenta. Thus a simple vector diagram gave the result regarding
the interaction of angular momenta which we have derived somewhat laboriously
above.





other triplet (S = 1). It is not possible to "doctor" the J value
assignments so as to eliminate these discrepancies.

58d. Solution of the Fine-structure Problem by the Electron-spin
Hypothesis. The last mentioned discrepancy in the theory suggested
to Uhlenbeck and Goudsmit¹ the possibility that the electrons have an
angular momentum due to rotation about an internal axis (electron
spin) in addition to their orbital angular momentum. On this hypothesis
the fine structure of the spectra of sodium and the other alkalies is
interpreted as due to the interaction between the orbital and spin angular
momenta of the valence electron alone, the core having zero angular
momentum and playing only a minor passive part in the problem.
Since all angular momenta which need be considered belong to a single
electron, we shall use the small letters $l, s, j, m_l, m_s, m_j$ instead of
$L, S, J, M_L, M_S, M_J$ for their quantum numbers. In order to account for the
doublet fine structure we have only to identify the value $S = s = \tfrac12$
characteristic of sodium with the quantum number which determines the
resultant electron-spin angular momentum. In other words we assume
that the square of the spin angular momentum has the single eigenvalue
$\tfrac12(\tfrac12 + 1)h^2/4\pi^2$. Assuming further that the spin angular momentum
combines with other angular momenta just as two orbital angular
momenta combine, we infer that the $j$ values of the resultant of the spin
and orbital angular momenta should be $l + \tfrac12$ and $l - \tfrac12$ provided that
$l > 0$. If $l = 0$, $j$ reduces to the spin quantum number $\tfrac12$. This
checks with the empirical results for sodium and the other alkalies.
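
The coupling rule just stated can be put in a few lines of code. The following is only a minimal sketch: the function name `allowed_j` and the s, p, d, f labels are illustrative choices, not notation from the text. It lists the j values that result when a single spin of 1/2 is combined with an orbital quantum number l.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # spin quantum number s of a single electron


def allowed_j(l):
    """Return the j values obtained by combining orbital l with spin 1/2.

    For l > 0 there are two values, l + 1/2 and l - 1/2 (a doublet);
    for l = 0 only j = 1/2 survives (a single level).
    """
    if l < 0:
        raise ValueError("l must be a non-negative integer")
    if l == 0:
        return [HALF]
    return [l + HALF, l - HALF]


if __name__ == "__main__":
    # Illustrative loop over the s, p, d, f series of an alkali atom.
    for l, name in enumerate("spdf"):
        print(name, [str(j) for j in allowed_j(l)])
```

The s levels come out single and all others double, which is the doublet pattern of the alkalies described above.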

In the case of magnesium and the other alkaline earths, the spins
of the two valence electrons must be assumed to combine first with each
other to form a resultant spin angular momentum with quantum numbers
0 or 1. The observed fine structure is then interpreted as due to the
combination of the orbital angular momentum of the optical electron
with the resultant spin angular momentum.

Classical theory requires that the spin angular momentum shall be
accompanied by a corresponding spin magnetic moment. The ratio
of the magnetic moment of a spherical spinning electron to the
corresponding angular momentum had been calculated before the promulgation
of the Uhlenbeck and Goudsmit hypothesis by Abraham, who found it
to be $e/\mu c$, or twice the value of the same ratio for the orbital motion of an
electron. (The formula is algebraic: the negative charge e goes with a
magnetic moment opposite in direction to the spin angular momentum.)
Hence the spin axis should precess about the lines of an external magnetic
field with a double-normal Larmor precession frequency.

If the spin angular momentum of an electron is free to orient itself 
in any direction, and is subjected to the influence of an external magnetic 
field, we should expect the component of that momentum in the direction 

¹ Loc. cit., footnote 2, p. 491.





of the field, say $S_z$, to be quantized just as the component of the orbital
angular momentum parallel to the field is quantized in the elementary
theory of the Zeeman effect. The eigenvalues of $S_z$ compatible with the
resultant spin angular momentum $(S^2)' = \tfrac12(\tfrac12 + 1)h^2/4\pi^2$ are
evidently $\pm\tfrac12\,\dfrac{h}{2\pi}$. Classically the mutual energy of the field and spin would
be equal to the negative scalar product of the magnetic moment and the
field strength $\vec{\mathcal H}$. The eigenvalues of the mutual energy are accordingly
$\pm\dfrac{e\mathcal H h}{4\pi\mu c}$, where $\mathcal H = |\vec{\mathcal H}|$. In the sharp-series energy levels of sodium
the electron spin is uncoupled and we should accordingly expect these
levels to be split in an external magnetic field into two components
symmetrically placed with respect to the unperturbed line and having
the energy difference $e\mathcal H h/2\pi\mu c$, i.e., twice the spacing of the magnetic
sublevels in the simple Zeeman effect of Sec. 49c. An empirical analysis
of the complex Zeeman effect for sodium reveals the expected splitting
of the sharp-series energy levels and thus checks the assumed ratio of
the magnetic moment to the spin angular momentum, as well as our
general procedure. A very important additional check comes from the
measurements of the so-called "gyromagnetic ratio" in ferromagnetic
solids.¹
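
To give a feeling for the size of the splitting just described, the sketch below evaluates it numerically. It is an illustration under stated assumptions: the text works in Gaussian units, whereas the code uses the SI forms $eBh/4\pi m$ and $eBh/2\pi m$ (no factor $c$), CODATA 2018 constants, and function names chosen here for convenience.

```python
import math

# CODATA 2018 values in SI units (an assumption of this sketch; the text
# itself uses Gaussian units, where the same quantities carry a factor 1/c).
E_CHARGE = 1.602176634e-19        # elementary charge, C
PLANCK_H = 6.62607015e-34         # Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31  # electron mass, kg


def normal_zeeman_spacing(field_tesla):
    """Spacing of orbital magnetic sublevels, e*B*h/(4*pi*m), in joules."""
    return E_CHARGE * field_tesla * PLANCK_H / (4.0 * math.pi * ELECTRON_MASS)


def spin_doublet_splitting(field_tesla):
    """Separation of the two spin levels, e*B*h/(2*pi*m), in joules.

    This is twice the normal Zeeman spacing, reflecting the assumed
    ratio e/(mu c) of spin magnetic moment to spin angular momentum.
    """
    return 2.0 * normal_zeeman_spacing(field_tesla)


if __name__ == "__main__":
    b = 1.0  # tesla, an arbitrary illustrative field strength
    print("normal spacing (J):", normal_zeeman_spacing(b))
    print("spin splitting (J):", spin_doublet_splitting(b))
```

For a field of one tesla the normal spacing is numerically the Bohr magneton expressed in joules, and the spin doublet is twice as wide.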

We have next to examine the nature of the interaction between the
spin angular momentum and the orbital angular momentum of an
electron. A proper treatment of this interaction yields a quantitative
explanation of the fine structure of the alkali-metal spectra and the basis
for a theory of the fine structure of the hydrogen-atom spectrum. The
existence of any interaction implies a torque exerted on the spin due to
the motion of the electron through the electric field of the nucleus and a
countertorque about the nucleus as a center and tending to change the
nuclear angular momentum. The reason for these torques becomes
evident when we recollect that an electrostatic field can be transformed
into an electromagnetic one by the application of a Lorentz
transformation. In a frame of reference in which the electron is momentarily at
rest and the nucleus moving with a velocity $\vec v$ the electron is subject to a
magnetic field

$$\vec{\mathcal H}_n = \frac{Z|e|}{c}\,\frac{\vec r\times\vec v}{r^3},$$

where $\vec r$ is the vector distance from the electron to the nucleus. The
electron in this field will have a classical energy


¹ Cf. S. J. Barnett, Rev. Mod. Phys. 7, 129 (1935).





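$$U_s' = -\frac{e}{\mu c}\,\vec S\cdot\vec{\mathcal H}_n = \frac{Ze^2}{\mu c^2}\,\vec S\cdot\left[\frac{\vec r}{r^3}\times\vec v\right]$$

due to its spin. If we suppose this energy to be independent of the frame
of reference and note that $\mu\,\vec r\times\vec v$ is equal to the orbital angular
momentum $\vec L$ in that frame of reference in which the nucleus is at rest, we obtain
the classical expression

$$U_s' = \frac{Ze^2}{\mu^2c^2}\,\frac{\vec S\cdot\vec L}{r^3} \tag{58·3}$$

for the mutual energy of the two angular momenta.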

A more careful relativistic analysis shows, however, that the above
expression should properly be multiplied by 1/2, the famous Thomas
correction factor.¹ Of the various discussions of this factor which have
been published, that by Kramers is perhaps the most luminous. He
shows that if the torque applied to the spin angular momentum of an
electron at rest in a magnetic field is $(e/\mu c)\,\vec S\times\vec{\mathcal H}$ the corresponding
torque for an electron moving with velocity $\vec v$ is given by the equation

^.^|VxS + l|x|8x;i}+?, (58-4) 

in which $\tau$ is the proper time, and $\vec T$ is defined by

T = S(7 • S) - i8(7 • S) - i7(S • S)|- • 

If the electron is in not too rapid motion about a center of force, we can
replace the electric force $\vec{\mathcal E}$ by $(\mu/e)\,d\vec v/d\tau$ and thereby reduce $\vec T$ to the

¹ L. H. Thomas, Nature 117, 514 (1926), Phil. Mag. 3, 3 (1927); J. Frenkel,
Zeits. f. Physik 37, 243 (1926); H. A. Kramers, Physica 1, 825 (1934).

Recent letters to the editor of the Physical Review by Inglis and by Dancoff and
Inglis [Phys. Rev. 50, 783, 784 (1936)] bring out an important point regarding the
Thomas correction which is somewhat obscured from the dynamical point of view of
Kramers. The corrected spin-orbit interaction energy $U_s$ of Eq. (58·6) is shown to
be the sum of the "magnetic" energy term $U_s'$ of Eq. (58·3) and a negative kinematic
term $U_s''$ of relativistic origin.
$U_s''$ is equal to $-\tfrac12 U_s'$ in the case of an electron in orbital motion in a Coulomb field.
On the other hand in atomic nuclei, forces of a nonelectric character predominate and
$U_s''$ is greater in absolute value than $U_s'$. This causes a reversal of the sign of the
spin-orbit interaction for nuclei.





form

$$\vec T = \frac{1}{2c^2}\,\frac{d}{d\tau}\{\vec v\times[\vec S\times\vec v]\}.$$

For motion in a closed orbit $\vec T$ would average to zero. The secular
changes in $\vec S$ are then given by (58·4) with the term $\vec T$ omitted. The
average torque for the time of one revolution is thus perpendicular to
$\vec S$ and the secular changes alter its direction but not its magnitude.
The corresponding energy for an electron in orbital motion through an
electrostatic central force field with potential $\Phi$ is


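$$U_s = -\frac{e}{2\mu c^2}\,\vec S\cdot[\vec{\mathcal E}\times\vec v]
= \frac{e}{2\mu^2c^2}\left(\frac{1}{r}\frac{d\Phi}{dr}\right)\vec S\cdot\vec L
= \frac{1}{2\mu^2c^2}\left(\frac{1}{r}\frac{dV}{dr}\right)\vec S\cdot\vec L. \tag{58·5}$$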


In the case of a Coulomb field with charge $-Ze$ at the center of force the
expression for the energy reduces to

$$U_s = \frac{Ze^2}{2\mu^2c^2}\,\frac{\vec S\cdot\vec L}{r^3}, \tag{58·6}$$


or half the value given by (58·3). Except for the difference in sign
already noted, this final classical formula is similar in form to the energy
expression used by Landé, which also involves the factor $\vec S\cdot\vec L/r^3$ when
properly reduced, but not the factor Z. It will be observed that (58·6)
makes the energy greatest for parallel orientations of $\vec S$ and $\vec L$, so that
it has the correct sign and eliminates the first of the difficulties in the
Landé theory mentioned above.


59. THE FINE STRUCTURE OF THE SPECTRA OF ATOMIC SYSTEMS WITH
A SINGLE VALENCE ELECTRON

The work of Sec. 58 gives the electron-spin hypothesis fairly definite
form and indicates a decided superiority over the Landé magnetic-core
theory as an explanation of the fine structure of atomic spectra. We
proceed to the crucial test of a quantitative calculation of the spacing
of the fine-structure levels in the spectra of the alkalies and of hydrogen.
Incidentally we shall derive the formulas appropriate to positive ions
isoelectronic to one or the other of the above types of atoms.

Let us begin with a first-order approximation to the energy correction
for the spin-orbit interaction in hydrogen and the hydrogenic states of
the alkalies. For this purpose we need to compute the mean value of $U_s$
for the zero-order case B wave functions of Sec. 58b or, what amounts to
the same thing, the diagonal terms of the matrix of $U_s$ in a scheme of
wave functions which makes the unperturbed energy, the square of the
orbital angular momentum, that of the spin angular momentum, and
that of the resultant angular momentum, all diagonal. We reserve for
later discussion the question of the exact nature of the wave functions





to be used in the study of electron spin and proceed on the assumption
that from a matrix point of view the interaction between the spin angular
momentum and the orbital angular momentum can be treated in the
same way as that of two orbital angular momenta.

Let $\mathbf L$, $\mathbf S$, $\mathbf J$ denote respectively the matrices of $\vec L$, $\vec S$, and $\vec J$. We know
from Sec. 58b that when $\mathbf L_1^2$, $\mathbf L_2^2$, and $(\mathbf L_1 + \mathbf L_2)^2$ are diagonal, $\mathbf L_1\cdot\mathbf L_2$ is diagonal.
Similarly, when $\mathbf L^2$, $\mathbf S^2$, and

$$\mathbf J^2 = (\mathbf L_x + \mathbf S_x)^2 + (\mathbf L_y + \mathbf S_y)^2 + (\mathbf L_z + \mathbf S_z)^2 \tag{59·1}$$

are diagonal, the scalar product $\mathbf S\cdot\mathbf L$ must be diagonal. In fact (59·1)
leads directly to the relation [cf. (58·1)]

$$\mathbf J^2 = \mathbf L^2 + \mathbf S^2 + 2\,\mathbf S\cdot\mathbf L, \tag{59·2}$$

if we assume that the components of $\mathbf L$ commute with those of $\mathbf S$. From
(59·2) we can compute the eigenvalues of $\mathbf S\cdot\mathbf L$. Thus, using the quantum
numbers of $\mathbf J^2$ and $\mathbf L^2$ as matrix indices and suppressing the other indices
required for a complete notation, we have (cf. p. 366)

$$(j,l|\,\mathbf S\cdot\mathbf L\,|j,l) = \frac{h^2}{8\pi^2}\left[j(j+1) - l(l+1) - \tfrac12(\tfrac12 + 1)\right]. \tag{59·3}$$

The possible values of $j$ are $l + \tfrac12$ and $l - \tfrac12$. Hence the two eigenvalues

$$(l+\tfrac12,\,l|\,\mathbf S\cdot\mathbf L\,|l+\tfrac12,\,l) = \frac{h^2}{8\pi^2}\,l, \qquad
(l-\tfrac12,\,l|\,\mathbf S\cdot\mathbf L\,|l-\tfrac12,\,l) = -\frac{h^2}{8\pi^2}\,(l+1) \tag{59·4}$$

are obtained.

Equation (58·6) shows that the matrix $\mathbf U_s$ is a multiple of the product
of $\mathbf S\cdot\mathbf L$ into the matrix of $1/r^3$. Since the former matrix is diagonal,
the diagonal elements of $\mathbf U_s$ which give the first-order energy corrections
are multiples of products of the eigenvalues of $\mathbf S\cdot\mathbf L$ into diagonal
elements of the matrix of $1/r^3$, i.e., into the average values of $1/r^3$ for the
appropriate orbital wave functions.

We proceed to a calculation of $\overline{1/r^3}$ for a state with total quantum
number $n$ and azimuthal quantum number $l$. As the radial wave functions
are independent of the magnetic quantum number $m_l$, the mean
values of $1/r^3$ are also independent of $m_l$. This is fortunate as the $m_l$
values have been scrambled in forming the zero-order case B functions.
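
The eigenvalues of the scalar product of spin and orbital angular momentum quoted in (59·3) and (59·4) are easy to check with exact rational arithmetic. The sketch below is illustrative only; the function name `spin_orbit_eigenvalue` and the choice of $(h/2\pi)^2$ as the unit are assumptions made here, not notation from the text.

```python
from fractions import Fraction

S = Fraction(1, 2)  # spin quantum number of the electron


def spin_orbit_eigenvalue(l, j):
    """Eigenvalue of S.L in units of (h/2pi)^2.

    Uses J^2 = L^2 + S^2 + 2 S.L, so that
    S.L = [j(j+1) - l(l+1) - s(s+1)] / 2 in these units.
    """
    j = Fraction(j)
    if l < 0:
        raise ValueError("l must be a non-negative integer")
    if j not in (l + S, l - S) or j <= 0:
        raise ValueError("j must equal l + 1/2 or l - 1/2 and be positive")
    return (j * (j + 1) - l * (l + 1) - S * (S + 1)) / 2


if __name__ == "__main__":
    for l in range(1, 4):
        upper = spin_orbit_eigenvalue(l, l + S)   # expected l/2
        lower = spin_orbit_eigenvalue(l, l - S)   # expected -(l+1)/2
        print(l, upper, lower)
```

In these units the two eigenvalues come out as $l/2$ and $-(l+1)/2$, which is (59·4) with $h^2/8\pi^2$ written as one half of $(h/2\pi)^2$.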





The calculation of $\overline{1/r^3}$ for hydrogenic states was first carried through
on the basis of the Bohr theory.¹ A wave-mechanical evaluation of this
quantity has been worked out by Waller² using a somewhat laborious
direct evaluation of the integrals $\int \mathcal R_{nl}(r)^2\,r^{-3}\,dr$. The following discussion,
however, is an adaptation of an ingenious matrix treatment due to
Heisenberg and Jordan.³ The procedure involves two steps of which the
first is the evaluation of $\overline{1/r^3}$ in terms of $\overline{1/r^2}$.

As a starting point we choose the scheme of wave functions $\chi_{nlm_l}$
introduced in Sec. 28 and based on the use of
$\int\!\!\int\!\!\int |\chi|^2 \sin\theta\,dr\,d\theta\,d\varphi$ as a
normalizing integral. From the corresponding radial equation (28·19)
we see that the Hamiltonian operator appropriate to this type of wave
function for the unperturbed problem, with $\vec L$ and $\vec S$ uncoupled, is

$$H_a = -\frac{h^2}{8\pi^2\mu}\frac{\partial^2}{\partial r^2} + \frac{L^2}{2\mu r^2} - \frac{Ze^2}{r}, \tag{59·5}$$

where $L^2$ is to be given the form of Eq. (34·18).

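Denoting the Hermitian operator $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial r}$ by $p_r$ let us consider the
Poisson bracket

$$[H_a, p_r] = \frac{2\pi i}{h}\,(H_a p_r - p_r H_a) = \frac{L^2}{\mu r^3} - \frac{Ze^2}{r^2}. \tag{59·6}$$

It follows from the fundamental formula (38·6) that the mean value of
this operator for any state is the time rate of change of the mean value
of $p_r$ for that state. In the case of a stationary state the mean value of
$p_r$ must be constant and that of $[H_a, p_r]$ must be zero. If the state in
question is an eigenstate for $L^2$ we have

$$\int\!\!\int\!\!\int \chi_{nlm_l}^{*}\,[H_a, p_r]\,\chi_{nlm_l}\,\sin\theta\,dr\,d\theta\,d\varphi
= \frac{l(l+1)h^2}{4\pi^2\mu}\,\overline{\left(\frac{1}{r^3}\right)} - Ze^2\,\overline{\left(\frac{1}{r^2}\right)} = 0.$$

Thus⁴

$$\overline{\left(\frac{1}{r^3}\right)} = \frac{4\pi^2\mu Ze^2}{h^2\,l(l+1)}\,\overline{\left(\frac{1}{r^2}\right)}. \tag{59·7}$$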
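
Relation (59·7) can be tested numerically on a simple case. The sketch below is an illustration under assumptions that do not come from the text: atomic units are used (so that $4\pi^2\mu e^2/h^2 = 1$), the state is the hydrogen 2p level with $Z = 1$, and the standard normalized radial density $r^4 e^{-r}/24$ is taken for granted. The integration routine is an ordinary composite Simpson rule written for this example.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule for the integral of f over [a, b]."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def mean_inverse_power_2p(k):
    """Mean value of r**(-k) for hydrogen 2p in atomic units.

    Assumes the radial probability density r^4 exp(-r) / 24, so the
    integrand is r**(4 - k) * exp(-r) / 24 (valid here for k <= 3).
    """
    return simpson(lambda r: r ** (4 - k) * math.exp(-r) / 24.0, 0.0, 60.0)


if __name__ == "__main__":
    m2 = mean_inverse_power_2p(2)
    m3 = mean_inverse_power_2p(3)
    l, Z = 1, 1
    # In atomic units (59.7) reads  <1/r^3> = Z / (l (l + 1)) * <1/r^2>.
    print(m3, Z / (l * (l + 1)) * m2)
```

Both printed numbers agree to the accuracy of the quadrature, as (59·7) requires for $l = 1$.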


¹ Cf. M. Born, Atommechanik, §22.

² Waller, Zeits. f. Physik 38, 635 (1926).

³ W. Heisenberg and P. Jordan, Zeits. f. Physik 37, 263 (1926). A more general
matrix method for calculating the mean values of negative powers of r has been given
by Van Vleck, Proc. Roy. Soc. A 143, 679 (1934). The derivation of (59·12) below
also follows a procedure due to Van Vleck, Quantum Principles and Line Spectra,
p. 299, 1926.

⁴ A similar examination of the mean value of $[H, r p_r]$ for an arbitrary central force
field yields





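The eigenvalues of the Hamiltonian of Eq. (59·5) are those of the
radial equation

$$H_a\,\mathcal R(r) = E\,\mathcal R(r), \tag{59·8}$$

viz.,

$$E_n = -\frac{NhcZ^2}{n^2}, \qquad n = n_r + l + 1.$$

To evaluate $\overline{1/r^2}$ we introduce the modified Hamiltonian

$$H_a(\epsilon) = -\frac{h^2}{8\pi^2\mu}\frac{\partial^2}{\partial r^2} + \frac{(l+\epsilon)(l+\epsilon+1)h^2}{8\pi^2\mu r^2} - \frac{Ze^2}{r}.$$

The corresponding radial equation is the same as (29·1) except for the
substitution of $l + \epsilon$ for $l$. It is not difficult to see that this equation
can still be solved by the polynomial method of Sec. 29 and that it yields
the eigenvalues

$$E_n(\epsilon) = -\frac{NhcZ^2}{(n_r + l + \epsilon + 1)^2} = \frac{E_n n^2}{(n + \epsilon)^2}. \tag{59·9}$$

Hence

$$\left[\frac{dE_n(\epsilon)}{d\epsilon}\right]_{\epsilon=0} = -\frac{2E_n}{n} = \frac{2NhcZ^2}{n^3}. \tag{59·10}$$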


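Elementary perturbation theory can also be used to evaluate $E_n(\epsilon)$ for
small values of $\epsilon$ and thus $dE_n(\epsilon)/d\epsilon$ for the point $\epsilon = 0$. We obtain

$$\left[\frac{dE_n(\epsilon)}{d\epsilon}\right]_{\epsilon=0} = \frac{h^2}{8\pi^2\mu}\,(2l+1)\,\overline{\left(\frac{1}{r^2}\right)}. \tag{59·11}$$

Equating the right-hand members of Eqs. (59·10) and (59·11) gives

$$\overline{\left(\frac{1}{r^2}\right)} = \frac{16\pi^2\mu NcZ^2}{h\,n^3(2l+1)}. \tag{59·12}$$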
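
The two routes to the derivative, one through the exact eigenvalues and one through perturbation theory, can be compared numerically. The sketch below is a consistency check only and rests on assumptions not made in the text: Hartree atomic units, in which $Nhc = 1/2$ and $h^2/8\pi^2\mu = 1/2$, and the corresponding atomic-unit form $Z^2/n^3(l + \tfrac12)$ of the mean value of $1/r^2$. All function names are illustrative.

```python
def energy(n, eps, Z=1.0):
    """E_n(eps) of (59.9) in Hartree units, where N*h*c = 1/2."""
    return -0.5 * Z ** 2 / (n + eps) ** 2


def derivative_from_eigenvalue(n, Z=1.0, step=1e-5):
    """Central-difference estimate of dE_n/d(eps) at eps = 0, cf. (59.10)."""
    return (energy(n, step, Z) - energy(n, -step, Z)) / (2.0 * step)


def mean_inverse_r2(n, l, Z=1.0):
    """Mean value of 1/r^2 in atomic units (assumed form of (59.12))."""
    return Z ** 2 / (n ** 3 * (l + 0.5))


def derivative_from_perturbation(n, l, Z=1.0):
    """Right-hand side of (59.11) in atomic units: (1/2)(2l+1)<1/r^2>."""
    return 0.5 * (2 * l + 1) * mean_inverse_r2(n, l, Z)


if __name__ == "__main__":
    for n, l in [(2, 1), (3, 1), (3, 2)]:
        print(n, l, derivative_from_eigenvalue(n),
              derivative_from_perturbation(n, l))
```

The two columns agree for every $l$ belonging to a given $n$, which is simply the statement that $(2l+1)$ times the mean value of $1/r^2$ depends on $n$ alone.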


Let $E_s(n,l,j)$ denote the mean value of $U_s$ for the state whose quantum
numbers are $n$, $l$, $j$. Let $\alpha$ denote the dimensionless fine-structure
constant $2\pi e^2/hc$. Then Eqs. (58·6), (59·4), (59·7), and (59·12) yield


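$$2E - \overline{rV'} - 2\overline{V} = 0,$$
or, if $T$ denotes the kinetic energy,

$$2\overline{T} = \overline{rV'(r)}.$$

This is the appropriate specialization of the virial theorem. For the special case of an
hydrogenic atom this reduces to

$$\overline{\left(\frac{1}{r}\right)} = -\frac{2E_n}{Ze^2} = \frac{2NhcZ}{e^2n^2}.$$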





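$$E_s(n,l,l+\tfrac12) = \frac{\alpha^2Z^4Nhc}{n^3(l+1)(2l+1)}, \qquad
E_s(n,l,l-\tfrac12) = -\frac{\alpha^2Z^4Nhc}{n^3\,l\,(2l+1)}. \tag{59·14}$$

The corresponding expression for the energy difference of two
fine-structure levels belonging to the same parent $n,l$ level is

$$\Delta E_s = \frac{\alpha^2Z^4Nhc}{n^3\,l(l+1)}. \tag{59·15}$$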
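
A short numerical sketch shows the order of magnitude given by the doublet formula (59·15). The constants are CODATA 2018 values and, like the function names and the choice of electron volts as the unit, are assumptions of this example rather than part of the text.

```python
ALPHA = 7.2973525693e-3    # fine-structure constant (CODATA 2018)
RYDBERG_EV = 13.605693123  # N*h*c expressed in electron volts (CODATA 2018)


def spin_orbit_shift(n, l, upper, Z=1):
    """First-order spin-orbit shift in eV for j = l + 1/2 (upper=True)
    or j = l - 1/2 (upper=False), valid for 1 <= l < n."""
    if not 1 <= l < n:
        raise ValueError("need 1 <= l < n")
    scale = ALPHA ** 2 * Z ** 4 * RYDBERG_EV / (n ** 3 * (2 * l + 1))
    return scale / (l + 1) if upper else -scale / l


def doublet_spacing(n, l, Z=1):
    """Energy difference (59.15) of the two fine-structure levels, in eV."""
    if not 1 <= l < n:
        raise ValueError("need 1 <= l < n")
    return ALPHA ** 2 * Z ** 4 * RYDBERG_EV / (n ** 3 * l * (l + 1))


if __name__ == "__main__":
    # Hydrogen 2p doublet, Z = 1.
    print("2p doublet spacing (eV):", doublet_spacing(2, 1))
    # Rapid decrease with n and l:
    for n, l in [(2, 1), (3, 1), (3, 2), (4, 3)]:
        print(n, l, doublet_spacing(n, l))
```

For the 2p level of hydrogen the spacing comes out near $4.5\times10^{-5}$ eV, and the loop illustrates how quickly the doublet narrows as $n$ and $l$ increase.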


The above formula with Z set equal to unity is directly applicable to
the hydrogenic orbits of the alkalies and ions of the same structure, but
in most cases the doublets for these states are too narrow to be
measurable. In the case of almost hydrogenic states (slightly penetrating orbits)
it can be used with the substitution of an appropriate effective value of
the nuclear charge. In order to adapt the formula to the doublets of
deeply penetrating orbits it is useful to employ a model in which the
potential-energy function $V(r)$ follows a Coulomb law with an effective
nuclear charge $ze$ ($z = 1$ for neutral atoms) outside the core radius $r_1$
and the same law with a larger effective nuclear charge $Z_ie$ inside $r_1$.
The average value of $V'(r)/r$ [cf. Eq. (58·5)] is then a suitable weighted
average of the values $ze^2/r^3$ and $Z_ie^2/r^3$ appropriate to the outer and
inner parts of the field.¹ Assuming that $Z_i$ is much larger than $z$ one
obtains the Landé formula for the doublet spacing, viz.,

$$\Delta E = \frac{\alpha^2Z_i^2z^2Nhc}{n^{*3}\,l(l+1)}. \tag{59·16}$$

Here $n^*$ denotes the effective quantum number defined by (56·8). This
equation checks very satisfactorily with the experimental data,¹ and
reflects clearly the previously mentioned rapid decrease in the doublet
spacing with increasing values of $n$ and $l$.


60. THE APPROXIMATE RELATIVISTIC THEORY OF THE HYDROGEN 

ATOM 

Of course the hydrogen atom and the structurally similar ions He+, 
Li++ form the simplest atoms with a single valence electron. The 
above theory should apply most accurately to them. In comparing the 
theory with the observed energy-level systems of such hydrogenic struc- 
tures we must remember, however, that in first approximation the 
energy levels of hydrogen for a given total quantum number and different
values of $l$ are coincident. We should therefore expect the group of
levels due to spin-orbit interaction for any given value of $n$ to contain

¹ Cf. Pauling and Goudsmit, The Structure of Line Spectra, pp. 60-63, New York,
1930, for details.





$2n - 1$ components, one for the s state and two for each possible
value of $l$ greater than zero. Actually the number is less than this
due to the existence of a relativistic effect which we have so far ignored
and to a residual degeneracy which eludes the combined action of the
spin-orbit and relativistic terms in the Hamiltonian. The relativity
correction in question is unimportant for the alkalies because it is small
compared with the uncertainty in the theory due to the lack of an exact
potential function $V(r)$ and to the essential inaccuracy of the
single-electron model we have used. It is most important for the discussion
of the low-lying X-ray energy levels.

The "pre-Dirac" relativistic equation for a single electron, (6·8), is
not in the standard form $(H - E)\psi = 0$ and is not easily put into it.
It differs, however, from the non-relativistic equation (5·5) only in the
addition of a term $\left(\dfrac{2\pi}{hc}\right)^2(E - V)^2\psi$ which is ordinarily small compared
with the other terms. Strictly speaking we cannot add this term to the
ordinary Hamiltonian because it contains the unknown parameter $E$.
We know the approximate value of $E$, however, by solving the equation
without the $(E - V)^2$ term and can correct this by substituting the
approximate value, say $E_0$, for $E$ in that term. We thus obtain the
equation¹

$$\left\{H_a - \frac{[E_0 - V(r)]^2}{2\mu c^2} - E\right\}\psi = 0 \tag{60·1}$$

in which $H_a - [E_0 - V(r)]^2/2\mu c^2$ plays the part of the perturbed Hamiltonian.
The relativistic energy-level correction $E_R$ for a state with total quantum
number $n$ and azimuthal quantum number $l$ is then the mean value of
$-[E_0 - V(r)]^2/2\mu c^2$ for the state in question. In the case of a hydrogenic
atom with nuclear charge $Ze$,

$$E_R = -\frac{1}{2\mu c^2}\left[E_0^2 + 2E_0Ze^2\,\overline{\left(\frac{1}{r}\right)} + Z^2e^4\,\overline{\left(\frac{1}{r^2}\right)}\right].$$

The mean values of $1/r^2$ and $1/r$ are given in Eq. (59·12) and footnote 4,
p. 505, respectively. Hence

$$E_R = -\frac{\alpha^2Z^4Nhc}{n^3}\left[\frac{1}{l+\tfrac12} - \frac{3}{4n}\right]. \tag{60·2}$$


¹ The eigenfunctions of this equation for different values of $E_0$ are not rigorously
orthogonal.





Adding this to the spin correction we obtain the following approximate
expression for the combined energy-level correction of relativity and the
spin-orbit interaction:

$$E_{R+S}(n,l,l+\tfrac12) = -\frac{\alpha^2Z^4Nhc}{n^3}\left[\frac{1}{l+1} - \frac{3}{4n}\right], \tag{60·3}$$

$$E_{R+S}(n,l,l-\tfrac12) = -\frac{\alpha^2Z^4Nhc}{n^3}\left[\frac{1}{l} - \frac{3}{4n}\right]. \tag{60·4}$$

In this approximation, therefore, despite the perturbations introduced,
there is a residual degeneracy due to the fact that

$$E_{R+S}(n,l,l+\tfrac12) = E_{R+S}(n,l+1,l+\tfrac12). \tag{60·5}$$

As a result of this residual degeneracy it is appropriate to label the
energy levels of hydrogen-like atoms by the corresponding $j$ values
rather than by the $l$ values. We can then replace (60·3) and (60·4) by

$$E_{R+S}(n,j-\tfrac12,j) = E_{R+S}(n,j+\tfrac12,j) = -\frac{\alpha^2Z^4Nhc}{n^3}\left[\frac{1}{j+\tfrac12} - \frac{3}{4n}\right]. \tag{60·6}$$

The degeneracy is not removed by a more accurate theory as proved by
Dirac's exact treatment of hydrogenic atoms on the basis of his rigorously
relativistic quantum theory of the electron.¹

The agreement between (60·6) and the observed spectra of H and He⁺
is too well known to require comment here.
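
The residual degeneracy (60·5) and the closed form (60·6) can be verified exactly with rational arithmetic. The sketch below is illustrative: energies are measured in units of $\alpha^2Z^4Nhc$, the function names are chosen here, and the $j = l + \tfrac12$ spin-orbit expression is used for $l = 0$ as well, which is an assumption of this sketch made so that the s levels can be included.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def relativity_shift(n, l):
    """E_R of (60.2) in units of alpha^2 Z^4 N h c."""
    return -Fraction(1, n ** 3) * (1 / (l + HALF) - Fraction(3, 4 * n))


def spin_orbit_shift(n, l, j):
    """Spin-orbit shift of Sec. 59 in the same units.

    The j = l + 1/2 branch is applied for l = 0 too (assumption of
    this sketch); the j = l - 1/2 branch requires l >= 1.
    """
    if j == l + HALF:
        return Fraction(1, n ** 3 * (l + 1) * (2 * l + 1))
    if j == l - HALF and l >= 1:
        return -Fraction(1, n ** 3 * l * (2 * l + 1))
    raise ValueError("j must equal l + 1/2 or l - 1/2")


def combined_shift(n, l, j):
    """Sum of the relativity and spin-orbit corrections."""
    return relativity_shift(n, l) + spin_orbit_shift(n, l, j)


def closed_form(n, j):
    """Right-hand side of (60.6) in units of alpha^2 Z^4 N h c."""
    return -Fraction(1, n ** 3) * (1 / (j + HALF) - Fraction(3, 4 * n))


if __name__ == "__main__":
    n = 3
    for l in range(0, n - 1):
        j = l + HALF
        print(j, combined_shift(n, l, j), combined_shift(n, l + 1, j),
              closed_form(n, j))
```

Each printed row shows three equal fractions, so that for given $n$ the level depends on $j$ alone.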

The formulas for spin-orbit interaction and for the relativity
correction are both of great importance in connection with X-ray energy levels.
These are the imperfectly quantized energies of the ions formed by the
removal of an inner electron from a relatively heavy atom. Their
differences can be approximately identified with the negatives of the
corresponding differences of the energy of the electron removed, if the
latter is treated like the optical electron of the alkalies using a
single-electron model in which the effective potential energy $V(r)$ is made up
of the nuclear potential energy and an additional radial function to
take the place of the average interaction potential with the other
electrons. $V(r)$ is roughly hydrogenic for the inner electrons due to the
dominance of the strong nuclear field. Hence Eqs. (59·14) and (60·2)
can be used with appropriate effective values of Z. However, on account
of the large values of $Z\alpha$ in the highest levels of the heavy atoms, our
approximate treatment of the relativity correction is not altogether
satisfactory. The order of the fine-structure levels is, of course, reversed
when we pass from the alkali spectra to the alkali-like X-ray spectra.

¹ P. A. M. Dirac, P.Q.M., 1st ed., Chap. XIII; 2d ed., Chap. XII.





61. THE PAULI WAVE-MECHANICAL FORMULATION OF THE THEORY OF 

ELECTRON SPIN 

61a. Nature of the Configuration Space and Wave Functions. Up to
this point our application of quantum mechanics to the electron-spin
hypothesis has been based on formal analogy for the purpose of arriving
as quickly as possible at a quantitative check with the observed fine
structure of the alkali spectra. Before applying the theory to the study
of atoms with more than one valence electron we pause to develop a
more complete quantum-mechanical formulation of the ideas roughly
outlined in Sec. 58.

A very natural procedure for setting up a complete spin theory in
wave-mechanical form would be to treat the electron as a rotating
charged spherical rigid body whose position is fixed by Eulerian angles
$\theta$, $\varphi$, $\psi$. We should then use a wave equation for the free rotational
motion of the electron like that of a symmetrical top with three equal
moments of inertia. This procedure leads, however, to the same set of
eigenvalues for the resultant angular momentum and for the component
parallel to any fixed axis as we get for a system of particles. It does
not give the half-integral spin quantum numbers demanded by the
spectroscopic facts, nor would it give the single eigenvalue of the resultant
$S^2$ which is observed experimentally. We might perhaps account for the
absence of many empirical values of $S^2$ as a result of the impossibility of
exciting higher internal-energy states, rather than as an evidence of their
non-existence; but the half-integral quantum numbers impose a well-nigh
fatal objection to any attempt to write out probability amplitudes
with the Eulerian angles as arguments.¹ This fact is not very surprising
when we reflect that the angles in question are essentially unobservable.
We have, in fact, no use for such probability amplitudes except to
complete the formal parallelism between the theory of spin angular
momentum and that of orbital angular momentum.

Pauli has shown, however, that it is possible to give the electron-spin
theory a formulation in terms of probability amplitudes without
introducing coordinates which specify the orientation of the electron or that
of its spin axis.² To this end we introduce the eigenvalues of spin
angular-momentum components, or multiples of them, as new arguments
for the wave functions. We have already become familiar in Sec. 36
with probability amplitudes some of whose arguments have discrete
spectra. Only the eigenvalues of normally commuting dynamical
variables can appear as simultaneous arguments in any probability
amplitude. We attribute to the components of the spin the commutation
properties we have already found to be characteristic of the orbital
¹ Cf., however, C. G. Darwin, Proc. Roy. Soc. A115, 1 (1927).

² W. Pauli, Jr., Zeits. f. Physik 43, 601 (1927).




angular momentum. Hence we can use at most the eigenvalues of one
of the spin components in any one wave function. It is customary to
choose $S_z$ for this purpose. The square of the resultant spin angular
momentum could be introduced as an additional argument, but this is
unnecessary if we make the assumption that only one eigenvalue exists.
We assume that each of the three dynamical variables $S_x$, $S_y$, $S_z$ whose
operators we wish to define has just the two eigenvalues $\pm\tfrac12\,\dfrac{h}{2\pi}$. It will
be convenient to introduce two alternative symbols, viz., $\sigma$ and $m_s$, for
the quantity $2\pi S_z'/h$ which we call the quantum number of $S_z$. Thus
either $\sigma$ or $m_s$ denotes a quantity which can take on just two values,
$\pm\tfrac12$. In place of the eigenvalue $S_z'$ itself we ordinarily use the
quantum number $\sigma$ as the fourth argument, or spin coordinate, of the
probability amplitude of a single electron. Let $\alpha(\sigma)$, or $\delta_{\sigma,\frac12}$, denote
that function of $\sigma$ which is unity when $\sigma$ is $+\tfrac12$ and zero otherwise.
Let $\beta(\sigma)$, or $\delta_{\sigma,-\frac12}$, denote that function of $\sigma$ which is unity when $\sigma$ is $-\tfrac12$
and zero otherwise. Any probability amplitude for the coordinates
$x$, $y$, $z$, $S_z$ can then be written in the form

$$\psi(x',y',z',\sigma) = u_\alpha(x',y',z')\,\alpha(\sigma) + u_\beta(x',y',z')\,\beta(\sigma). \tag{61·1}$$

Here $u_\alpha$ and $u_\beta$ are simply two different physically admissible functions
of the space coordinates $x'$, $y'$, $z'$. If we wish to indicate that a wave
function is an eigenfunction of $S_z$ with the eigenvalue $S_z'$, we introduce
$m_s$ as the quantum number corresponding to $S_z$ and attach it as a
subscript to the $\psi$ symbol. Thus $\psi_{m_s}(x',y',z',\sigma)$
denotes such an eigenfunction.

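In harmony with the procedures of Sec. 36 we assume the
normalization condition

$$\sum_\sigma \int\!\!\int\!\!\int |\psi(x',y',z',\sigma)|^2\,dx'dy'dz'
= \sum_\sigma \int\!\!\int\!\!\int \{|u_\alpha|^2\alpha(\sigma)^2 + |u_\beta|^2\beta(\sigma)^2
+ (u_\alpha u_\beta^* + u_\alpha^* u_\beta)\,\alpha(\sigma)\beta(\sigma)\}\,dx'dy'dz' = 1.$$

This reduces at once to

$$\int\!\!\int\!\!\int (|u_\alpha|^2 + |u_\beta|^2)\,dx'dy'dz' = 1. \tag{61·2}$$

With this normalization we shall interpret $|u_\alpha|^2dx'dy'dz'$ as the probability
that an electron lies in the volume element $dx'dy'dz'$ and that at the same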





time $\sigma$ has the value $+\tfrac12$. $|u_\beta|^2dx'dy'dz'$ is the corresponding probability
for the opposite value of $\sigma$. Thus the introduction of the spin
component $S_z$ in effect replaces the three-argument wave function of the
spinless electron by two such functions.

The reader will readily verify that if $F$ is an operator function of the
coordinates $x$, $y$, $z$ and their momenta and has an adjoint $F^\dagger$ with respect
to class D functions in $x',y',z'$ space, it has the same adjoint with respect
to class D functions in the enlarged $x',y',z',\sigma$ space. If $F$ is Hermitian or unitary
in $x',y',z'$ space it retains these properties in $x',y',z',\sigma$ space.

It is important to note at once that if we introduce the new wave
functions into the old three-dimensional Schrödinger equation $H\psi = E\psi$
and ask for eigenvalues and eigenfunctions we obtain the same eigenvalues
and multiples of the old eigenfunctions. Thus it is necessary that

$$Hu_\alpha\,\alpha(\sigma) + Hu_\beta\,\beta(\sigma) = Eu_\alpha\,\alpha(\sigma) + Eu_\beta\,\beta(\sigma). \tag{61·3}$$

Multiplying this equation by $\alpha(\sigma)$ and summing up over the two possible
values of $\sigma$ we obtain $Hu_\alpha = Eu_\alpha$ with a similar equation for $u_\beta$. Thus
both $u_\alpha$ and $u_\beta$ must be eigenfunctions if $\psi$ is one. Conversely, if $u_\alpha$
and $u_\beta$ are eigenfunctions with the same eigenvalue, it follows that $\psi$ is an
eigenfunction. If $\psi$ is a simultaneous eigenfunction of all members of a
complete set of commuting observables, say $H$, $L^2$, $L_z$, we can always
write

$$\psi = u_{E,l,m_l}(x',y',z')\,[c_\alpha\alpha(\sigma) + c_\beta\beta(\sigma)], \tag{61·4}$$

where $c_\alpha$ and $c_\beta$ are complex numbers. If $u_{E,l,m_l}$ is normalized in the
usual way, $c_\alpha$ and $c_\beta$ are subject to the normalization condition

$$|c_\alpha|^2 + |c_\beta|^2 = 1. \tag{61·5}$$

61b. Preliminary Discussion of Spin Operators and Spin Matrices.
If we set either $c_\alpha$ or $c_\beta$ equal to zero, $\psi_{E,l,m_l}$ becomes an eigenfunction of
$S_z$. Thus we obtain a complete set of simultaneous eigenfunctions of
$H$, $L^2$, $L_z$, and $S_z$:

$$\psi_{E,l,m_l,m_s}(x',y',z',\sigma) = u_{E,l,m_l}(x',y',z')\,\delta_{\sigma,m_s}. \tag{61·6}$$

As an arbitrary physically admissible function of $x',y',z',\sigma$ is expansible in
terms of the above set, the equation

$$S_z\,\psi_{E,l,m_l,m_s} = m_s\,\frac{h}{2\pi}\,\psi_{E,l,m_l,m_s} \tag{61·7}$$

is sufficient to define $S_z$. Clearly $S_z$ is a dynamical variable which
commutes with all operators which are functions of the Cartesian coordinates
$x',y',z'$ and their momenta, but are independent of the spin. It has the
property that, when applied to the product of a function of the space
coordinates $x',y',z'$ and a function of the spin coordinate $\sigma$, it acts only




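on the latter. In other words we can treat $x'$, $y'$, $z'$ as parameters in
applying $S_z$.

We assume that the other spin operators, $S_x$ and $S_y$, share the same
general properties as $S_z$, but do not commute with $S_z$. Since they act
only on the spin functions, they are determined by four-element matrices.
Thus

$$S_x\alpha(\sigma) = \beta(\sigma)S_x(\beta,\alpha) + \alpha(\sigma)S_x(\alpha,\alpha), \qquad
S_x\beta(\sigma) = \beta(\sigma)S_x(\beta,\beta) + \alpha(\sigma)S_x(\alpha,\beta). \tag{61·8}$$

In a matrix scheme in which $S_z$ and a set of spin-free operators $\gamma_1, \gamma_2, \cdots$
are diagonal, the matrices of all three spin operators are not only diagonal
with respect to the $\gamma$'s but have elements entirely independent of the
values of the $\gamma$'s. Thus, if $E'$ and $E''$ are discrete energy levels of a
spin-free Hamiltonian,

$$(E',l',m_l',\alpha|\,S_x\,|E'',l'',m_l'',\beta)
= \sum_\sigma \int\!\!\int\!\!\int u^*_{E',l',m_l'}(x',y',z')\,\alpha(\sigma)\,S_x\,u_{E'',l'',m_l''}(x',y',z')\,\beta(\sigma)\,dx'dy'dz'$$
$$= \delta_{E',E''}\,\delta_{l',l''}\,\delta_{m_l',m_l''}\sum_\sigma \alpha(\sigma)S_x\beta(\sigma)
= \delta_{E',E''}\,\delta_{l',l''}\,\delta_{m_l',m_l''}\,S_x(\alpha,\beta). \tag{61·9}$$

We can accordingly suppress the indices $E$, $l$, $m_l$ and treat $S_x$, $S_y$,
and $S_z$ as (2 × 2)-element matrices whose rows and columns are labeled
by $m_s$ values.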

The operators Sx, $y are determined by their matrices. In order to 
fix the latt(^r it is convenient to make use of an analysis which is equally 
applicable to the matrices of the orbital angular momentum, the spin 

angular momentum, and the n^sultant g of tln^ sjiin and orbital angular 
momenta. Hence it is worth while to treat the general case first and 
to consider the specialization appropriate to the spin later. ^ 

We assume (cf. Sec. 58c) that the rules for the commutation of
$\mathcal{S}_x$, $\mathcal{S}_y$, $\mathcal{S}_z$ among themselves are the same as for $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$. They then
form a special case of the operators $\alpha$, $\beta$, $\gamma$ of Sec. 40f. The eigenvalues
we have postulated for the components of $\vec{\mathcal{S}}$ and for $\mathcal{S}^2$ are in harmony
with the results obtained in Sec. 40f for the eigenvalues of $\gamma$ and $\omega^2$.
We accordingly proceed to the general problem of working out the
matrices of the operators $\alpha$, $\beta$, $\gamma$, $\omega^2$.

¹ M. Born, W. Heisenberg, and P. Jordan, Zeits. f. Physik 35, 557 (1926);
P. A. M. Dirac, P.Q.M., 2d ed. §§48, 40.

Let $\rho$ denote the dynamical variable $\alpha + i\beta$, whose analogue $\mathcal{L}_x + i\mathcal{L}_y$
proved so useful in Sec. 40f. Due to the Hermitian character of $\alpha$ and
$\beta$, the dynamical variable $\alpha - i\beta$ is adjoint to $\rho$ and can be labeled $\rho^{\dagger}$.
It follows from (40·16) that

$$\gamma\rho = \rho\left(\gamma + \frac{h}{2\pi}I\right), \tag{61·10}$$


where I denotes the unit matrix. 

We assume a matrix scheme based on simultaneous eigenfunctions
of $\gamma$, $\omega^2$, and such additional commuting Hermitian operators $\rho_1, \rho_2, \cdots$
as are needed to make a complete set of normally commuting variables.
Let $l$ and $m$ denote the quantum numbers of $\omega^2$ and $\gamma$ as defined on p. 316.
It is assumed that $\beta$ and $\gamma$ as well as $\alpha$ commute with the operators $\rho_k$. The
matrices $\alpha$, $\beta$, $\gamma$, $\omega^2$ will then be diagonal with respect to $l$ (cf.
Sec. 43, p. 351) and the $\rho_k$. Introducing an integer $\tau$, each value of
which denotes a set of simultaneous eigenvalues of the $\rho_k$, we indicate
a typical element of the matrix $\rho$ which is diagonal with respect to $l$ and the $\rho_k$ by
$\rho_{\tau l}(m', m'')$. All elements of $\rho$, $\rho^{\dagger}$, $\alpha$, $\beta$, $\gamma$, $\omega^2$ not expressible in this form
are zero. The totality of the elements of $\rho$ for a single pair of values
of $\tau$ and $l$ form a square matrix which is conveniently designated by $\rho_{\tau l}$.
Equation (61·10) now leads to


$$\gamma_{\tau l}\,\rho_{\tau l} = \rho_{\tau l}\left(\gamma_{\tau l} + \frac{h}{2\pi}I_l\right). \tag{61·11}$$


Here $I_l$ is a unit matrix with the same number of rows and columns as
$\rho_{\tau l}$, $\gamma_{\tau l}$, etc. As $\gamma$ has only diagonal elements of value $mh/2\pi$, this
matrix equation is equivalent to

$$(m' - m'' - 1)\,\rho_{\tau l}(m', m'') = 0. \tag{61·12}$$

Thus all elements of $\rho_{\tau l}$ vanish except those on that line parallel to
the principal diagonal for which $m' = m'' + 1$. Let

$$\rho_{\tau l}(m + 1, m) = a_{\tau l}(m).$$

Then $\rho^{\dagger}_{\tau l}(m, m + 1) = a_{\tau l}(m)^{*}$. All other elements vanish.

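In order to evaluate $a_{\tau l}(m)$ we make use of the matrix equation

$$\rho^{\dagger}\rho = \omega^2 - \gamma^2 - \frac{h}{2\pi}\gamma, \tag{61·13}$$

which is a corollary of (40·18). Introducing appropriate element values
into (61·13) we obtain

$$|a_{\tau l}(m)|^2 = \left(\frac{h}{2\pi}\right)^2\left[l(l + 1) - m(m + 1)\right]. \tag{61·14}$$

It follows that

$$a_{\tau l}(m) \equiv (\tau, l, m + 1|\rho|\tau, l, m) = \frac{h}{2\pi}\,e^{i\nu(\tau,l,m)}\sqrt{l(l + 1) - m(m + 1)}, \tag{61·15}$$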

where $\nu(\tau,l,m)$ is a phase constant which cannot be determined by matrix
algebra. It has definite values for every set of indices when the matrix
scheme is related to a set of wave functions with definite phases
according to the basic formula (44·2). As the phases of the wave
functions are arbitrary, the constants $\nu(\tau,l,m)$ are also arbitrary in a
sense and are usually set equal to zero.

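With the above choice of phases the relations $\alpha = \tfrac{1}{2}(\rho + \rho^{\dagger})$ and
$\beta = -\tfrac{1}{2}i(\rho - \rho^{\dagger})$ yield

$$\alpha_{\tau l}(m',m'') = \frac{h}{4\pi}\left[\delta_{m',m''+1}\sqrt{l(l+1) - m''(m''+1)} + \delta_{m'+1,m''}\sqrt{l(l+1) - m'(m'+1)}\right], \tag{61·16}$$

$$\beta_{\tau l}(m',m'') = \frac{h}{4\pi}\left[-i\,\delta_{m',m''+1}\sqrt{l(l+1) - m''(m''+1)} + i\,\delta_{m'+1,m''}\sqrt{l(l+1) - m'(m'+1)}\right]. \tag{61·17}$$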
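The construction of Eqs. (61·15) to (61·17) can be checked numerically. The following is only a sketch, assuming units in which $h/2\pi = 1$ and all phase constants $\nu$ set to zero; the function names are illustrative and are not taken from the text. It builds $\alpha$, $\beta$, $\gamma$ for a given $l$ and measures how far they depart from the commutation rule $\alpha\beta - \beta\alpha = i\gamma$ and from $\alpha^2 + \beta^2 + \gamma^2 = l(l+1)$.

```python
import math
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def angular_momentum_matrices(l):
    """Return (alpha, beta, gamma) in units of h/2pi for quantum number l.

    Rows and columns are ordered m = -l, -l+1, ..., +l (increasing m).
    """
    l = Fraction(l)
    dim = int(2 * l) + 1
    ms = [-l + k for k in range(dim)]
    # rho(m+1, m) = sqrt(l(l+1) - m(m+1)); phase constants assumed zero
    rho = [[0j] * dim for _ in range(dim)]
    for col in range(dim - 1):
        m = ms[col]
        rho[col + 1][col] = complex(math.sqrt(l * (l + 1) - m * (m + 1)))
    rho_dag = [[rho[c][r].conjugate() for c in range(dim)] for r in range(dim)]
    alpha = [[(rho[r][c] + rho_dag[r][c]) / 2 for c in range(dim)]
             for r in range(dim)]
    beta = [[(rho[r][c] - rho_dag[r][c]) / 2j for c in range(dim)]
            for r in range(dim)]
    gamma = [[complex(float(ms[r])) if r == c else 0j for c in range(dim)]
             for r in range(dim)]
    return alpha, beta, gamma


def max_commutation_error(l):
    """Largest element of alpha*beta - beta*alpha - i*gamma in magnitude."""
    a, b, g = angular_momentum_matrices(l)
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return max(abs(ab[i][j] - ba[i][j] - 1j * g[i][j])
               for i in range(n) for j in range(n))


def max_casimir_error(l):
    """Largest deviation of alpha^2 + beta^2 + gamma^2 from l(l+1) times I."""
    a, b, g = angular_momentum_matrices(l)
    squares = [matmul(x, x) for x in (a, b, g)]
    n = len(a)
    target = float(Fraction(l) * (Fraction(l) + 1))
    return max(abs(sum(s[i][j] for s in squares) - (target if i == j else 0))
               for i in range(n) for j in range(n))
```

For example, `max_commutation_error(Fraction(3, 2))` should be zero to within rounding error.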

Let us now apply these general results to the spin angular momentum
of an individual electron. In this case we replace the symbol $m$ by $m_s$
and restrict its values to $+\tfrac{1}{2}$ and $-\tfrac{1}{2}$. We replace the symbol $l$
by $s$ and give it the single value $\tfrac{1}{2}$ in accordance with the assumption
of Sec. 58d, p. 500 and the analysis of Sec. 40f. We can suppress $s$ as an
index since it has only the one value. The matrices $\mathbf{S}_{x\tau} + i\mathbf{S}_{y\tau}$ and
$\mathbf{S}_{x\tau} - i\mathbf{S}_{y\tau}$ have only one non-vanishing element. Thus

$$(m_s'|\mathbf{S}_{x\tau} + i\mathbf{S}_{y\tau}|m_s'') = \frac{h}{2\pi}\,\delta_{m_s',\,m_s''+1}.$$

Let us arrange the $m_s$ values for the different rows and columns in
order of increasing magnitude as follows:¹

$$m_s = -\tfrac{1}{2},\; +\tfrac{1}{2}.$$

¹ In comparing Eqs. (61·18) and (61·19) with the corresponding equations given by
other authors the reader must exercise circumspection. Thus Pauli in his original
article (loc. cit., p. 510) and his article in Geiger and Scheel's Handbuch der Physik,
24, Part I, reverses the ordering of rows and columns used here. Wigner, Gruppentheorie
und ihre Anwendung, p. 251, Berlin, 1931, has interchanged the x and y components
of the matrix of S.






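Then,

$$\mathbf{S}_{x\tau} + i\mathbf{S}_{y\tau} = \frac{h}{2\pi}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
\mathbf{S}_{x\tau} - i\mathbf{S}_{y\tau} = \frac{h}{2\pi}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \tag{61·18}$$

$$\mathbf{S}_{x\tau} = \frac{h}{4\pi}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\mathbf{S}_{y\tau} = \frac{h}{4\pi}\begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \qquad
\mathbf{S}_{z\tau} = \frac{h}{4\pi}\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{61·19}$$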
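Because the rows and columns here run in the order $m_s = -\tfrac{1}{2}, +\tfrac{1}{2}$, the matrix of the $y$ component differs in appearance from the form met in texts that list $+\tfrac{1}{2}$ first. The short sketch below, in units of $h/4\pi$ and with illustrative names not taken from the text, checks the commutation rule in the present ordering and shows the effect of reversing the order of rows and columns.

```python
# Spin matrices of Eq. (61·19) in units of h/4pi; rows and columns are
# ordered m_s = -1/2, +1/2 as in the text.
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]


def matmul2(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """Return ab - ba for 2 x 2 matrices."""
    ab, ba = matmul2(a, b), matmul2(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def reverse_order(m):
    """Relabel rows and columns in the order m_s = +1/2, -1/2."""
    return [[m[1 - i][1 - j] for j in range(2)] for i in range(2)]


def matrices_close(a, b, tol=1e-12):
    """Element-by-element comparison of two 2 x 2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))
```

In these units the rule $\mathbf{S}_x\mathbf{S}_y - \mathbf{S}_y\mathbf{S}_x = (ih/2\pi)\mathbf{S}_z$ reads `commutator(SX, SY) = 2i * SZ`, and `reverse_order(SY)` has $-i$ in the upper right-hand corner.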


Having determined the spin matrices, we return to the corresponding
operators. The matrix component $\mathcal{S}_x(\alpha,\beta)$ of Eq. (61·9) is the component
of $\mathbf{S}_{x\tau}$ for which the row index $m_s'$ is $+\tfrac{1}{2}$ and the column index
$m_s''$ is $-\tfrac{1}{2}$. Thus the arrangement of the elements in Eqs. (61·8) is
transposed with respect to the arrangements of Eqs. (61·18) and (61·19).
With this understanding, (61·8) is the type form for the conversion of
all the spin matrices into operators.

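We have assumed hitherto that the spin operators act only on the spin
functions when a probability amplitude is factored into the product of a
spin factor and a position factor. It is formally possible, however, to
get the same results with a different point of view. Consider the application
of the operator $\mathcal{S}_x$ to the general wave function of (61·1). By means
of (61·8) we obtain

$$\begin{aligned}
\mathcal{S}_x\psi &= \mathcal{S}_x[u_\beta\beta(\sigma) + u_\alpha\alpha(\sigma)]\\
&= u_\beta[\beta\,\mathcal{S}_x(\beta,\beta) + \alpha\,\mathcal{S}_x(\alpha,\beta)] + u_\alpha[\beta\,\mathcal{S}_x(\beta,\alpha) + \alpha\,\mathcal{S}_x(\alpha,\alpha)]\\
&= \beta[\mathcal{S}_x(\beta,\beta)u_\beta + \mathcal{S}_x(\beta,\alpha)u_\alpha] + \alpha[\mathcal{S}_x(\alpha,\beta)u_\beta + \mathcal{S}_x(\alpha,\alpha)u_\alpha].
\end{aligned} \tag{61·20}$$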


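Thus $\mathcal{S}_x$ can be thought of as transforming $u_\beta$ into $\mathcal{S}_x(\beta,\beta)u_\beta + \mathcal{S}_x(\beta,\alpha)u_\alpha$
and $u_\alpha$ into $\mathcal{S}_x(\alpha,\beta)u_\beta + \mathcal{S}_x(\alpha,\alpha)u_\alpha$. Inserting the actual values of
these matrix elements, we see that from our present point of view $\mathcal{S}_x$
transforms $u_\beta$ into $hu_\alpha/4\pi$ and $u_\alpha$ into $hu_\beta/4\pi$. Similarly $\mathcal{S}_y$ transforms
$u_\beta$ into $ihu_\alpha/4\pi$ and $u_\alpha$ into $-ihu_\beta/4\pi$. Finally $\mathcal{S}_z$
transforms $u_\beta$ into $-hu_\beta/4\pi$
and $u_\alpha$ into $hu_\alpha/4\pi$. This interpretation of the operators is very useful
in practice.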

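If we like, we can regard the two functions $u_\beta(x',y',z')$ and $u_\alpha(x',y',z')$
as the elements of a $(2 \times 1)$-element matrix function of the coordinates

$$\mathbf{u} = \begin{pmatrix} u_\beta \\ u_\alpha \end{pmatrix}. \tag{61·21}$$

The above rules then say that

$$\mathcal{S}_x\mathbf{u} = \begin{pmatrix} \mathcal{S}_x(\beta,\beta)u_\beta + \mathcal{S}_x(\beta,\alpha)u_\alpha \\ \mathcal{S}_x(\alpha,\beta)u_\beta + \mathcal{S}_x(\alpha,\alpha)u_\alpha \end{pmatrix} = \mathbf{S}_{x\tau}\mathbf{u}. \tag{61·22}$$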


$(4 \times 1)$-component matrix wave functions similar to the above are used
in Dirac's relativistic quantum theory of the electron.

It is of considerable importance to be able to make transformations
of the spin wave functions and spin operators from one set of coordinate
axes to another. The theory of these transformations was first worked
out by Pauli.¹

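Let the new coordinate axes $\bar{x}$, $\bar{y}$, $\bar{z}$ be derived from the original set
$x$, $y$, $z$ by rotation through the Eulerian angles $\varphi$, $\theta$, $\psi$ (cf. Fig. 14, Sec. 34f).
Since the angular-momentum operators $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ which we have used
as models for the construction of $\mathcal{S}_x$, $\mathcal{S}_y$, $\mathcal{S}_z$ form a vector, as indicated
by the notation $\vec{\mathcal{L}}$, we postulate that $\vec{\mathcal{S}}$ is a vector, i.e., that

$$\begin{aligned}
\mathcal{S}_{\bar{x}} &= \mathcal{S}_x\cos(x,\bar{x}) + \mathcal{S}_y\cos(y,\bar{x}) + \mathcal{S}_z\cos(z,\bar{x}),\\
\mathcal{S}_{\bar{y}} &= \mathcal{S}_x\cos(x,\bar{y}) + \mathcal{S}_y\cos(y,\bar{y}) + \mathcal{S}_z\cos(z,\bar{y}),\\
\mathcal{S}_{\bar{z}} &= \mathcal{S}_x\cos(x,\bar{z}) + \mathcal{S}_y\cos(y,\bar{z}) + \mathcal{S}_z\cos(z,\bar{z}).
\end{aligned} \tag{61·23}$$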

Let $T$ denote the operator which transforms an $x,y,z$ spin wave function
$(\sigma|)$ into the corresponding $\bar{x},\bar{y},\bar{z}$ spin wave function $(\bar{\sigma}|)$. The possible
values of the argument $\bar{\sigma}$ are the eigenvalues of $\mathcal{S}_{\bar{z}}$, each multiplied by
$2\pi/h$. Of course these values must be $\pm\tfrac{1}{2}$ if one set of axes is to be
fully equivalent to the other. Let $(\sigma|\bar{\sigma})$ denote the eigenfunction of
$\mathcal{S}_{\bar{z}}$ in the $x,y,z$ system for the eigenvalue $\bar{\sigma}h/2\pi$. By the fundamental
equation (36·76),

$$(\bar{\sigma}|) = T(\sigma|) = \sum_{\sigma}(\sigma|\bar{\sigma})^{*}(\sigma|). \tag{61·24}$$

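It follows from Eq. (36·5) that the transformed expressions for the operators
$\mathcal{S}_{\bar{x}}$, $\mathcal{S}_{\bar{y}}$, $\mathcal{S}_{\bar{z}}$, to be used in connection with the wave functions $(\bar{\sigma}|)$, are

$$\mathcal{S}'_{\bar{x}} = T\mathcal{S}_{\bar{x}}T^{-1}, \qquad \mathcal{S}'_{\bar{y}} = T\mathcal{S}_{\bar{y}}T^{-1}, \qquad \mathcal{S}'_{\bar{z}} = T\mathcal{S}_{\bar{z}}T^{-1}. \tag{61·25}$$

In order that the two sets of axes shall be fully equivalent it is necessary
that $\mathcal{S}'_{\bar{x}}$, $\mathcal{S}'_{\bar{y}}$, $\mathcal{S}'_{\bar{z}}$ shall be identical in form with $\mathcal{S}_x$, $\mathcal{S}_y$, $\mathcal{S}_z$, respectively.
The fact that this equivalence can be attained by a proper choice of the
phases of the eigenfunctions $(\sigma|\bar{\sigma})$ is proof that despite the asymmetry
of its external form the Pauli spin theory is in harmony with the isotropic
character of space.²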

Since the spin operators have purely discrete spectra with two eigenvalues
each, the functions $(\sigma|\bar{\sigma})^{*}$, i.e., $(\bar{\sigma}|\sigma)$, form a matrix with two rows
and two columns which we call $\mathbf{T}$, thereby establishing a parallelism
between (61·24) and the first of Eqs. (44·15). In place of Eqs. (61·25)
we can use

$$\mathbf{S}'_{\bar{x}} = \mathbf{T}\mathbf{S}_{\bar{x}}\mathbf{T}^{-1} = \mathbf{T}\mathbf{S}_{\bar{x}}\mathbf{T}^{\dagger}, \tag{61·26}$$


¹ Loc. cit., footnote 2, p. 510.

² A more complete discussion of this question with a derivation of the transformation
matrix (61·30) is to be found in E. Wigner's Gruppentheorie und ihre Anwendung
auf die Quantenmechanik der Atomspektren, Berlin, 1931.





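with corresponding equations for $\mathbf{S}'_{\bar{y}}$ and $\mathbf{S}'_{\bar{z}}$. We have now to choose the
unitary matrix $\mathbf{T}$ so that

$$\mathbf{T}\mathbf{S}_{\bar{x}}\mathbf{T}^{-1} = \mathbf{S}_x, \qquad \mathbf{T}\mathbf{S}_{\bar{y}}\mathbf{T}^{-1} = \mathbf{S}_y, \qquad \mathbf{T}\mathbf{S}_{\bar{z}}\mathbf{T}^{-1} = \mathbf{S}_z. \tag{61·27}$$

The existence of a solution of Eqs. (61·27) insures the existence of the
eigenfunctions $(\sigma|\bar{\sigma})$ and defines the operator $T$ of Eqs. (61·24) and (61·25).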

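It will suffice here to deal explicitly with the simple case where the
$\bar{x},\bar{y},\bar{z}$ system is obtained from the $x,y,z$ system by a rotation through
an angle $\varphi$ about the $z$ axis. The $\bar{z}$ axis is then coincident with the $z$
axis and the operators $\mathcal{S}_{\bar{z}}$ and $\mathcal{S}_z$ are identical. It is nevertheless necessary
to introduce a phase difference between the eigenfunctions of $\mathcal{S}_{\bar{z}}$
and $\mathcal{S}_z$ in order to make the matrices $\mathbf{S}'_{\bar{x}}$ and $\mathbf{S}'_{\bar{y}}$ identical with $\mathbf{S}_x$ and $\mathbf{S}_y$,
respectively. In this case

$$\mathbf{S}_{\bar{x}} = \mathbf{S}_x\cos\varphi + \mathbf{S}_y\sin\varphi = \frac{h}{4\pi}\begin{pmatrix} 0 & e^{i\varphi} \\ e^{-i\varphi} & 0 \end{pmatrix}, \tag{61·28}$$

$$\mathbf{S}_{\bar{y}} = -\mathbf{S}_x\sin\varphi + \mathbf{S}_y\cos\varphi = \frac{h}{4\pi}\begin{pmatrix} 0 & ie^{i\varphi} \\ -ie^{-i\varphi} & 0 \end{pmatrix}. \tag{61·29}$$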


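Since $\mathbf{S}_{\bar{z}}$ is identical with the diagonal matrix $\mathbf{S}_z$, the last of Eqs. (61·27)
requires that $\mathbf{T}$ shall be a diagonal unitary matrix, i.e., that it shall have
the form

$$\mathbf{T} = \begin{pmatrix} e^{i\alpha} & 0 \\ 0 & e^{i\beta} \end{pmatrix}.$$

The remaining Eqs. (61·27) yield

$$T(\lambda,\lambda)\,S_{\bar{x}}(\lambda,\mu) = S_x(\lambda,\mu)\,T(\mu,\mu), \qquad
T(\lambda,\lambda)\,S_{\bar{y}}(\lambda,\mu) = S_y(\lambda,\mu)\,T(\mu,\mu).$$

Inserting the values of the matrix elements, we obtain the four equations

$$e^{\pm i(\alpha - \beta + \varphi)} = 1 \quad \text{(each sign occurring twice)},$$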

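all of which are satisfied if we set $\beta - \alpha = \varphi$. Symmetry suggests that
we give $\beta$ and $\alpha$ the values $\pm\tfrac{1}{2}\varphi$, respectively. From the definition
of $\mathbf{T}$ it follows that the functions $(\sigma|\bar{\sigma})$ are

$$(\sigma|-\tfrac{1}{2}) = e^{i\varphi/2}\beta(\sigma), \qquad (\sigma|+\tfrac{1}{2}) = e^{-i\varphi/2}\alpha(\sigma).$$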
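The rotation about the $z$ axis can be checked directly. The sketch below assumes the diagonal form of $\mathbf{T}$ with $\alpha = -\tfrac{1}{2}\varphi$ and $\beta = +\tfrac{1}{2}\varphi$, works in units of $h/4\pi$, and keeps the row and column order $m_s = -\tfrac{1}{2}, +\tfrac{1}{2}$; the function names are illustrative only.

```python
import cmath


def barred_matrices(phi):
    """Matrices of Eqs. (61·28) and (61·29) in units of h/4pi."""
    e = cmath.exp(1j * phi)
    sx_bar = [[0, e], [e.conjugate(), 0]]
    sy_bar = [[0, 1j * e], [-1j * e.conjugate(), 0]]
    return sx_bar, sy_bar


def transform(matrix, phi):
    """Return T M T^dagger for T = diag(exp(-i phi/2), exp(+i phi/2))."""
    t = [cmath.exp(-0.5j * phi), cmath.exp(0.5j * phi)]
    # T is diagonal, so (T M T^dagger)[i][j] = t[i] * M[i][j] * conj(t[j])
    return [[t[i] * matrix[i][j] * t[j].conjugate() for j in range(2)]
            for i in range(2)]


def max_difference(a, b):
    """Largest element-wise difference between two 2 x 2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

For any angle, `transform(barred_matrices(phi)[0], phi)` should reproduce the matrix of $\mathbf{S}_x$ to within rounding error. With this choice of phases a rotation through $2\pi$ gives $\mathbf{T} = -\mathbf{I}$, so the spin functions change sign while the operators are unaffected.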

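The complete transformation matrix $\mathbf{T}(\varphi,\theta,\psi)$ for a rotation with the
arbitrary Eulerian angles $\varphi$, $\theta$, $\psi$, as given by Pauli¹ and Wigner,² is
a $(2 \times 2)$ unitary matrix, Eq. (61·30), whose elements are $\cos\tfrac{1}{2}\theta$ and
$\pm\sin\tfrac{1}{2}\theta$ multiplied by phase factors depending on $\varphi$ and $\psi$.

[Eq. (61·30): the explicit matrix elements are illegible in this copy.]

¹ Loc. cit., footnote 2, p. 510.

² Loc. cit., footnote 1, p. 515.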


61c. Application of the Pauli Theory to the Alkali Doublets. — As a 

simple exercise let us reexamine the problem of the spin-orbit interaction 
for the alkali atoms in terms of the Pauli wave-mechanical theory. 
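Let $H_0$ denote the unperturbed Hamiltonian of the atomic model used
in Sec. 56, i.e.,

$$H_0 = -\frac{h^2}{8\pi^2\mu}\nabla^2 + V(r). \tag{61·31}$$

The complete Hamiltonian is then

$$H = H_0 + \frac{1}{2\mu^2c^2}\left(\frac{1}{r}\frac{dV}{dr}\right)(\vec{\mathcal{L}}\cdot\vec{\mathcal{S}}) \tag{61·32}$$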


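if the relativistic corrections are omitted. We write down the corresponding
Schrödinger equation, set $\psi = \alpha(\sigma)u_\alpha + \beta(\sigma)u_\beta$, and make use
of the properties of the spin operators as formulated in Sec. 61b to obtain

$$\begin{aligned}
&\alpha(\sigma)\left\{(H_0 - E)u_\alpha + \left(\frac{1}{r}\frac{dV}{dr}\right)\frac{h}{8\pi\mu^2c^2}\left[(\mathcal{L}_x - i\mathcal{L}_y)u_\beta + \mathcal{L}_zu_\alpha\right]\right\}\\
&\quad + \beta(\sigma)\left\{(H_0 - E)u_\beta + \left(\frac{1}{r}\frac{dV}{dr}\right)\frac{h}{8\pi\mu^2c^2}\left[(\mathcal{L}_x + i\mathcal{L}_y)u_\alpha - \mathcal{L}_zu_\beta\right]\right\} = 0.
\end{aligned} \tag{61·33}$$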


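Since this equation is to hold for all values of $\sigma$, the cofactors of $\alpha(\sigma)$
and $\beta(\sigma)$ must vanish separately. Hence (61·33) is equivalent to the pair
of simultaneous equations

$$\begin{aligned}
(H_0 - E)u_\alpha + \left(\frac{1}{r}\frac{dV}{dr}\right)\frac{h}{8\pi\mu^2c^2}\left[(\mathcal{L}_x - i\mathcal{L}_y)u_\beta + \mathcal{L}_zu_\alpha\right] &= 0,\\
(H_0 - E)u_\beta + \left(\frac{1}{r}\frac{dV}{dr}\right)\frac{h}{8\pi\mu^2c^2}\left[(\mathcal{L}_x + i\mathcal{L}_y)u_\alpha - \mathcal{L}_zu_\beta\right] &= 0.
\end{aligned} \tag{61·34}$$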


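To get the appropriate zero-order wave functions and the first-order
energy corrections we can enter the equations with expansions such as

$$u_\alpha = \sum_{m_l=-l}^{+l} C^{\alpha}_{m_l}R_{nl}\Theta_{l,m_l}\Phi_{m_l}, \tag{61·35}$$

$$u_\beta = \sum_{m_l=-l}^{+l} C^{\beta}_{m_l}R_{nl}\Theta_{l,m_l}\Phi_{m_l}, \tag{61·36}$$

and attempt to determine appropriate values of the constants $C^{\alpha}_{m_l}$, $C^{\beta}_{m_l}$ by
conventional perturbation theory. Or, we can seek first to find such
values of the coefficients as will make $\psi$ from the beginning a simultaneous
eigenfunction of $H_0$, $\mathcal{L}^2$, and the operators $\mathcal{J}_z$ and $\mathcal{J}^2$ defined by the equations

$$\mathcal{J}_x = \mathcal{L}_x + \mathcal{S}_x, \quad \mathcal{J}_y = \mathcal{L}_y + \mathcal{S}_y, \quad \mathcal{J}_z = \mathcal{L}_z + \mathcal{S}_z, \quad \mathcal{J}^2 = \mathcal{J}_x^2 + \mathcal{J}_y^2 + \mathcal{J}_z^2 \tag{61·37}$$

[cf. Eq. (59·1)]. Let us choose the latter method.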

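Our first step must be to prove the usefulness of $\mathcal{J}_z$ and $\mathcal{J}^2$ by verifying the assumption
that they commute with the perturbed Hamiltonian $H$ of Eq. (61·32). Since $H$
is symmetric with respect to the three coordinate axes, the commutativity of $\mathcal{J}_z$ with $H$
implies the commutativity of $\mathcal{J}_x$, $\mathcal{J}_y$, $\mathcal{J}^2$ with $H$. Hence we need only to examine $\mathcal{J}_z$.
We know that $\mathcal{S}_z$ commutes with $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ and with any function of the coordinates $x$, $y$, $z$.
$\mathcal{L}_z$ commutes with any function of the radius [cf. Eq. (38·11)]. Hence

$$\begin{aligned}
\mathcal{J}_z\left(\frac{1}{r}\frac{dV}{dr}\right)(\vec{\mathcal{L}}\cdot\vec{\mathcal{S}}) - \left(\frac{1}{r}\frac{dV}{dr}\right)(\vec{\mathcal{L}}\cdot\vec{\mathcal{S}})\mathcal{J}_z
&= \frac{1}{r}\frac{dV}{dr}\left[\mathcal{J}_z(\vec{\mathcal{L}}\cdot\vec{\mathcal{S}}) - (\vec{\mathcal{L}}\cdot\vec{\mathcal{S}})\mathcal{J}_z\right]\\
&= \frac{1}{r}\frac{dV}{dr}\left[\mathcal{S}_x(\mathcal{L}_z\mathcal{L}_x - \mathcal{L}_x\mathcal{L}_z) + \mathcal{S}_y(\mathcal{L}_z\mathcal{L}_y - \mathcal{L}_y\mathcal{L}_z)\right.\\
&\qquad\left. +\; \mathcal{L}_x(\mathcal{S}_z\mathcal{S}_x - \mathcal{S}_x\mathcal{S}_z) + \mathcal{L}_y(\mathcal{S}_z\mathcal{S}_y - \mathcal{S}_y\mathcal{S}_z)\right]\\
&= \frac{1}{r}\frac{dV}{dr}\,\frac{ih}{2\pi}\left[\mathcal{S}_x\mathcal{L}_y - \mathcal{S}_y\mathcal{L}_x + \mathcal{L}_x\mathcal{S}_y - \mathcal{L}_y\mathcal{S}_x\right] = 0.
\end{aligned} \tag{61·38}$$

This proves that $\mathcal{J}_z$ commutes with the perturbing term in (61·32). Consequently
$\mathcal{J}_z$, $\mathcal{J}_x$, $\mathcal{J}_y$, $\mathcal{J}^2$ commute with that term and with the complete Hamiltonian $H$.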


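The functions of Eqs. (61·35) and (61·36) are from the beginning
eigenfunctions of $H_0$ and $\mathcal{L}^2$. Consider next the operator $\mathcal{J}_z$. Every
simultaneous eigenfunction of $\mathcal{L}_z$ and $\mathcal{S}_z$ is also an eigenfunction of $\mathcal{J}_z$.
Let $Y_{l,m_l}$ denote the product $\Theta_{l,m_l}\Phi_{m_l}$. We can then say that $R_{nl}Y_{l,m_l}\alpha(\sigma)$
is an eigenfunction of $\mathcal{J}_z$ with the eigenvalue

$$\frac{mh}{2\pi} = \frac{(m_l + \tfrac{1}{2})h}{2\pi}.$$

The most general simultaneous eigenfunction of $H_0$, $\mathcal{L}^2$, and $\mathcal{J}_z$ is

$$\psi_{n,l,m} = R_{nl}\left[C^{\alpha}_{m}\,\alpha(\sigma)Y_{l,m-\frac{1}{2}} + C^{\beta}_{m}\,\beta(\sigma)Y_{l,m+\frac{1}{2}}\right]. \tag{61·39}$$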

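Let us next try to choose $C^{\alpha}_{m}$ and $C^{\beta}_{m}$ so that the above expression will
become an eigenfunction of $\mathcal{J}^2$. Using $j$ as the quantum number of $\mathcal{J}^2$, we
denote the resulting simultaneous eigenfunction by $\psi_{n,l,j,m}$. In view of
(59·2)

$$\begin{aligned}
\mathcal{J}^2\psi_{n,l,j,m} &= \left[l(l + 1) + s(s + 1)\right]\frac{h^2}{4\pi^2}\psi_{n,l,j,m}\\
&\quad + \frac{h}{2\pi}R_{nl}\left\{\alpha(\sigma)\left[C^{\beta}_{m}(\mathcal{L}_x - i\mathcal{L}_y)Y_{l,m+\frac{1}{2}} + C^{\alpha}_{m}\mathcal{L}_zY_{l,m-\frac{1}{2}}\right]\right.\\
&\qquad\left. +\; \beta(\sigma)\left[C^{\alpha}_{m}(\mathcal{L}_x + i\mathcal{L}_y)Y_{l,m-\frac{1}{2}} - C^{\beta}_{m}\mathcal{L}_zY_{l,m+\frac{1}{2}}\right]\right\}.
\end{aligned} \tag{61·40}$$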

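The matrices $\mathbf{L}_x + i\mathbf{L}_y$ and $\mathbf{L}_x - i\mathbf{L}_y$ are special cases of the matrices $\rho$
and $\rho^{\dagger}$ of p. 514. The only nonvanishing elements of the former are

$$(m_l + 1|\mathcal{L}_x + i\mathcal{L}_y|m_l) = \frac{h}{2\pi}\left[l(l + 1) - m_l(m_l + 1)\right]^{\frac{1}{2}},$$

which for $m_l = m - \tfrac{1}{2}$ becomes $(h/2\pi)\left[l(l + 1) - (m - \tfrac{1}{2})(m + \tfrac{1}{2})\right]^{\frac{1}{2}}$
[cf. Eq. (61·16)]. Hence

$$(\mathcal{L}_x + i\mathcal{L}_y)Y_{l,m-\frac{1}{2}} = \frac{h}{2\pi}\left[l(l + 1) - m^2 + \tfrac{1}{4}\right]^{\frac{1}{2}}Y_{l,m+\frac{1}{2}}.$$

Similarly,

$$(\mathcal{L}_x - i\mathcal{L}_y)Y_{l,m+\frac{1}{2}} = \frac{h}{2\pi}\left[l(l + 1) - m^2 + \tfrac{1}{4}\right]^{\frac{1}{2}}Y_{l,m-\frac{1}{2}}.$$

We have to solve

$$\mathcal{J}^2\psi_{n,l,j,m} = j(j + 1)\frac{h^2}{4\pi^2}\psi_{n,l,j,m}.$$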

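Setting the cofactors of $\alpha(\sigma)$ and $\beta(\sigma)$ equal to zero, and introducing
the contractions

$$Q = l(l + 1) + \tfrac{3}{4} - j(j + 1), \qquad Z \equiv \left[l(l + 1) - m^2 + \tfrac{1}{4}\right]^{\frac{1}{2}}, \tag{61·41}$$

we derive

$$\begin{aligned}
(Q + m - \tfrac{1}{2})C^{\alpha}_{m} + ZC^{\beta}_{m} &= 0,\\
ZC^{\alpha}_{m} + (Q - m - \tfrac{1}{2})C^{\beta}_{m} &= 0.
\end{aligned} \tag{61·42}$$

The condition for a nontrivial solution is

$$(Q - \tfrac{1}{2} + m)(Q - \tfrac{1}{2} - m) - Z^2 = 0.$$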


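Hence $Q$ must have one of the two values $l + 1$ or $-l$. In the former case
$j = l - \tfrac{1}{2}$ and

$$\frac{C^{\alpha}_{m}}{C^{\beta}_{m}} = -\left(\frac{l + \tfrac{1}{2} - m}{l + \tfrac{1}{2} + m}\right)^{\frac{1}{2}}. \tag{61·43}$$

In the latter case $j = l + \tfrac{1}{2}$ and

$$\frac{C^{\alpha}_{m}}{C^{\beta}_{m}} = \left(\frac{l + \tfrac{1}{2} + m}{l + \tfrac{1}{2} - m}\right)^{\frac{1}{2}}. \tag{61·44}$$

It follows from the normalization condition (61·5) that if $j = l - \tfrac{1}{2}$ and
we give $C^{\beta}_{m}$ a positive real value

$$C^{\alpha}_{m} = -\left(\frac{l + \tfrac{1}{2} - m}{2l + 1}\right)^{\frac{1}{2}}, \qquad C^{\beta}_{m} = \left(\frac{l + \tfrac{1}{2} + m}{2l + 1}\right)^{\frac{1}{2}},$$

while, if $j = l + \tfrac{1}{2}$,

$$C^{\alpha}_{m} = \left(\frac{l + \tfrac{1}{2} + m}{2l + 1}\right)^{\frac{1}{2}}, \qquad C^{\beta}_{m} = \left(\frac{l + \tfrac{1}{2} - m}{2l + 1}\right)^{\frac{1}{2}}.$$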



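The resulting zero-order wave functions of $H$ are

$$\begin{aligned}
\psi_{n,l,l-\frac{1}{2},m} &= \frac{R_{nl}}{(2l + 1)^{\frac{1}{2}}}\left[-(l + \tfrac{1}{2} - m)^{\frac{1}{2}}\,Y_{l,m-\frac{1}{2}}\alpha(\sigma) + (l + \tfrac{1}{2} + m)^{\frac{1}{2}}\,Y_{l,m+\frac{1}{2}}\beta(\sigma)\right],\\
\psi_{n,l,l+\frac{1}{2},m} &= \frac{R_{nl}}{(2l + 1)^{\frac{1}{2}}}\left[(l + \tfrac{1}{2} + m)^{\frac{1}{2}}\,Y_{l,m-\frac{1}{2}}\alpha(\sigma) + (l + \tfrac{1}{2} - m)^{\frac{1}{2}}\,Y_{l,m+\frac{1}{2}}\beta(\sigma)\right].
\end{aligned} \tag{61·45}$$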
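The coefficients appearing in Eq. (61·45) can be checked against the linear equations (61·42) for particular values of $l$, $j$, and $m$. The sketch below is illustrative only: the function names are not taken from the text, and the sign convention (positive $C^{\beta}_{m}$ for $j = l - \tfrac{1}{2}$) is the one adopted above.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def coefficients(l, j, m):
    """Return (C_alpha, C_beta) of Eq. (61·45) for the given l, j, m."""
    l, j, m = Fraction(l), Fraction(j), Fraction(m)
    plus = math.sqrt((l + HALF + m) / (2 * l + 1))
    minus = math.sqrt((l + HALF - m) / (2 * l + 1))
    if j == l + HALF:
        return plus, minus
    if j == l - HALF:
        return -minus, plus
    raise ValueError("j must equal l + 1/2 or l - 1/2")


def residuals(l, j, m):
    """Left-hand sides of the two equations (61·42); both should vanish."""
    l, j, m = Fraction(l), Fraction(j), Fraction(m)
    q = l * (l + 1) + Fraction(3, 4) - j * (j + 1)
    z = math.sqrt(l * (l + 1) - m * m + Fraction(1, 4))
    c_alpha, c_beta = coefficients(l, j, m)
    first = float(q + m - HALF) * c_alpha + z * c_beta
    second = z * c_alpha + float(q - m - HALF) * c_beta
    return first, second
```

For a $p$ electron, `coefficients(1, Fraction(3, 2), Fraction(1, 2))` gives the pair $(\sqrt{2/3}, \sqrt{1/3})$, whose squares sum to unity as the normalization requires.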


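Let us now apply the perturbation theory of Sec. 48a to determine the
first-order energy corrections. As $\mathcal{J}^2$, $\mathcal{J}_z$, $\mathcal{L}^2$ commute with the perturbing
term of the Hamiltonian (61·32) the secular determinant of (48·7) has
no nonvanishing nondiagonal elements. The energy corrections are the
diagonal elements of the matrix of the perturbing term. We find

$$E^{(1)}_{n,l,l+\frac{1}{2}} = \frac{lh^2}{16\pi^2\mu^2c^2}\left(\frac{1}{r}\frac{dV}{dr}\right)_{nl}, \qquad
E^{(1)}_{n,l,l-\frac{1}{2}} = -\frac{(l + 1)h^2}{16\pi^2\mu^2c^2}\left(\frac{1}{r}\frac{dV}{dr}\right)_{nl}, \tag{61·46}$$

in accordance with our previous result.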
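The numerical factors $l$ and $-(l + 1)$ in Eq. (61·46) come from the eigenvalue of $\vec{\mathcal{L}}\cdot\vec{\mathcal{S}} = \tfrac{1}{2}(\mathcal{J}^2 - \mathcal{L}^2 - \mathcal{S}^2)$. A brief sketch, using exact fractions and units of $h^2/4\pi^2$ (the function names are illustrative), confirms the two factors and shows that the doublet components, weighted by their $2j + 1$ states, leave the mean energy of the level unshifted.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_orbit_factor(l, j):
    """Eigenvalue of L.S in units of h^2/4pi^2 for s = 1/2."""
    l, j = Fraction(l), Fraction(j)
    s = HALF
    return (j * (j + 1) - l * (l + 1) - s * (s + 1)) / 2


def weighted_shift_sum(l):
    """Sum of (2j + 1) times the L.S eigenvalue over both doublet levels."""
    l = Fraction(l)
    return sum((2 * j + 1) * spin_orbit_factor(l, j)
               for j in (l + HALF, l - HALF))
```

Multiplying `spin_orbit_factor(l, j)` by $(h^2/4\pi^2)(1/2\mu^2c^2)$ and the mean value of $(1/r)\,dV/dr$ reproduces the two expressions of Eq. (61·46).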

If there is a uniform external magnetic field $\vec{\mathcal{H}}$, a term

$$\frac{e}{2\mu c}\,\vec{\mathcal{H}}\cdot(\vec{\mathcal{L}} + 2\vec{\mathcal{S}})$$

should be added to the Hamiltonian of Eq. (61·32). The reader will find
it a simple, but instructive, exercise to use this term to work out the theory
of the complex Zeeman effect for the alkali atoms.



CHAPTER XIV 


THE THEORY OF THE STRUCTURE OF MANY-ELECTRON 

ATOMS 

62. GENERAL FORMULATION OF THE PROBLEM 

62a. The Configuration Space. — The most higlily developed field of 
applications of the nonrelativistic quantum mechanics is in the theoretical
study of the optical spectra of the elements. The quantum theory
had its origin in the attempt to interpret these spectra and gives a highly
satisfactory account of their main features. Although the mathematical
difficulty of solving the Schrödinger equation for the general
many-electron atom is so great that a complete set of exact solutions is
wholly unattainable, we can go much farther with available approximations
than we can in theoretical studies of the structure of molecules and
of the solid state. In this field, moreover, we avoid the basic difficulties
which beset the problem of quantum electrodynamics and the problem of
nuclear structure. Hence it is appropriate to bring the present volume
to a close with an introductory application of our general theory to the
central problem of the structure of many-electron atoms and their spectra.

As the wave functions in the Pauli spin theory for a single-electron
atom are spread out over a four-dimensional configuration space, the wave
functions in an atomic problem involving $f$ electrons require a $4f$-dimensional
configuration space. We denote the spin coordinate of the $k$th
electron by $m_{sk}$ or $\sigma_k$. There are $2^f$ possible sets of values of the $f$ spin
coordinates and the complete wave function is accordingly the sum of
$2^f$ different $3f$-dimensional "space functions" $u(x_1', y_1', \cdots, z_f')$,
each multiplied by a different product of delta functions of the coordinates
$m_{sk}$.
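The counting of spin configurations can be made concrete with a short enumeration. This is only a sketch; the function name is illustrative and each spin coordinate is represented by an exact fraction $\pm\tfrac{1}{2}$.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def spin_configurations(f):
    """List every set of values (m_s1, ..., m_sf) for f electrons."""
    return list(product((-HALF, HALF), repeat=f))
```

For $f = 3$ the list has $2^3 = 8$ members, so the complete wave function of a three-electron atom is a sum of eight space functions, one for each member.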

In normalization, the computation of mean values, and the formation
of matrix elements, we have to work out scalar products over all configuration
space. The process of integration over the $3f$-dimensional
positional coordinate space must then be supplemented by a summation
over all possible sets of values for the spin coordinates. Using the notation
introduced in Sec. 36c we indicate this mixed process of integration
by the symbol $\sum\!\!\!\!\!\int$. Thus the normalization condition takes the form

$$(\psi,\psi) = \sum\!\!\!\!\!\!\int \psi^{*}\psi\,d\tau = 1. \tag{62·1}$$

Functions of the Cartesian space coordinates and the spin coordinates
which are of class D in the former for every fixed set of values of the
latter are conveniently reckoned as of class D in the combined set of
space-spin coordinates, and as physically admissible in the $4f$-dimensional
theory. Operators which are Hermitian or unitary with respect to class D
and positional coordinate space are also Hermitian, or unitary, as the
case may be, with respect to class D and the enlarged $4f$-dimensional
configuration space. With the aid of the new coordinate space and the
modified definition of class D we are clearly at liberty to carry over
bodily the general theory of dynamical variables developed in Sec. 36
into the study of problems involving many particles with spin.

62b. The Hamiltonian Operator. — According to the Kramers classical 
model of Sec. 58d the secular torque acting on the spin angular momentum 
of an individual electron is 


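$$\vec{T} = \frac{e}{\mu c}\,\vec{\mathcal{S}} \times \left\{\vec{\mathcal{H}} + \frac{1}{2c}\left[\vec{\mathcal{E}} \times \vec{v}\right]\right\}. \tag{62·2}$$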

The corresponding energy of the spin with respect to the electromagnetic
field $\vec{\mathcal{E}}$, $\vec{\mathcal{H}}$ is

$$U_s = -\frac{e}{\mu c}\,\vec{\mathcal{S}} \cdot \left\{\vec{\mathcal{H}} + \frac{1}{2c}\left[\vec{\mathcal{E}} \times \vec{v}\right]\right\}. \tag{62·3}$$

In the case of a single electron in a Coulomb field, we set $\vec{\mathcal{H}} = 0$ and
$\vec{\mathcal{E}} = -\vec{r}Ze/r^3$. This reduces the spin energy to the form

$$U_s = \frac{Ze^2}{2\mu^2c^2r^3}\,(\vec{\mathcal{L}}\cdot\vec{\mathcal{S}}) \tag{62·4}$$

given in Eq. (58·6). The expressions for the electric and magnetic forces
become much more complicated, however, for many-particle problems.
To attempt to take into account the finite speed of propagation of
electric and magnetic forces would lead us into the previously mentioned
perplexities of quantum electrodynamics. Setting aside these retardation
effects, as they are called after the retarded potentials used to compute
them, and treating the instantaneous field of each electron as if it were
made up of a static electric charge $e$, a steady current element $e\vec{v}$, and a
static elementary magnet of moment $e\vec{\mathcal{S}}/\mu c$, it is possible to devise a
plausible expression for the $U_s$ of a many-electron atom in terms of the
space coordinates and momenta of the electrons, the spin operators, and
the potentials of the external electromagnetic field.¹ We shall not
undertake here, however, either an examination or application of the
complicated Hamiltonian arrived at in this way. The spin energy is in
any case relatively small, and the difficulties of an exact treatment of the
electrostatic energy are so great as to render somewhat superfluous any
attempt to deal very accurately with the spin-orbit and spin-spin energies
of the best available Hamiltonian operator.

We are therefore led to seek the simplest approximate expression for
the spin energy which will give a qualitative representation of the experimental
observations regarding the fine structure of spectrum lines. To
this end we neglect the magnetic forces acting on each electron due to the
others and identify the field $\vec{\mathcal{E}}$ of Eq. (62·3) with the sum of the external
electric field and the average field acting on the electron as a result of the
electrostatic forces coming from the nucleus and the other electrons. In
other words we set

$$\vec{\mathcal{E}} = \vec{\mathcal{E}}' - \frac{\vec{r}}{er}\frac{dV}{dr},$$

where $\vec{\mathcal{E}}'$ is the external electric field and $V(r)$ is identical with the potential
of $H_0$ of Sec. 57b. $\vec{\mathcal{H}}$ is identified with the external magnetic field $\vec{\mathcal{H}}'$
at the point in question. Thus the classical expression (62·3) gives
rise to the sum of two terms $U_s$ and $U_s'$ in the Hamiltonian operator, one
representing the spin-orbit energy, the other the mutual energy of the spin
and the external field. Explicitly these terms are [cf. Eq. (58·5)]

$$U_s = \frac{1}{2\mu^2c^2}\sum_{k=1}^{f}\frac{1}{r_k}\frac{dV(r_k)}{dr_k}\,(\vec{\mathcal{L}}_k\cdot\vec{\mathcal{S}}_k), \tag{62·5}$$

$$U_s' = -\frac{e}{\mu c}\sum_{k=1}^{f}\vec{\mathcal{S}}_k\cdot\left\{\vec{\mathcal{H}}' + \frac{1}{2c}\left[\vec{\mathcal{E}}' \times \vec{v}_k\right]\right\}. \tag{62·6}$$

The term $U_s$ is somewhat arbitrary, as $V(r)$ means nothing unless we
base the calculation of atomic energy levels on a perturbation computation
in which the starting point is the solution of a well-chosen degenerate
central-field problem of the type described in Sec. 57b. Since this is
about the only possible method of attack on complicated atoms, the
use of (62·5) is not unsatisfactory.


¹ Cf., e.g., J. Frenkel, Wave Mechanics, Advanced General Theory, section 38,
Oxford, 1934. Other treatments of this subject are to be found in the following
references: W. Heisenberg, Zeits. f. Physik 39, 499 (1926); W. Pauli, Jr., Zeits. f.
Physik 43, 601 (1927); J. A. Gaunt, Proc. Roy. Soc. A122, 513 (1929), Trans. Roy. Soc.
228, 151 (1929); G. Breit, Phys. Rev. 36, 383 (1930).





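In the absence of an external field the complete Hamiltonian for an
atom, or atomic ion, neglecting the nuclear motion as in (17·2), becomes

$$H = -\frac{h^2}{8\pi^2\mu}\sum_{k=1}^{f}\nabla_k^2 - \sum_{k=1}^{f}\frac{Ze^2}{r_k} + \sum_{k>j}\frac{e^2}{r_{kj}} + \frac{1}{2\mu^2c^2}\sum_{k=1}^{f}\frac{1}{r_k}\frac{dV(r_k)}{dr_k}\,(\vec{\mathcal{L}}_k\cdot\vec{\mathcal{S}}_k). \tag{62·7}$$

We arbitrarily resolve this Hamiltonian into the sum of three terms as
follows:

$$H = H_0 + H_1 + U_s, \tag{62·8}$$

where

$$H_0 = \sum_{k=1}^{f}\left[-\frac{h^2}{8\pi^2\mu}\nabla_k^2 + V(r_k)\right], \tag{62·9}$$

$$H_1 = \sum_{k>j}\frac{e^2}{r_{kj}} - \sum_{k=1}^{f}\frac{Ze^2}{r_k} - \sum_{k=1}^{f}V(r_k). \tag{62·10}$$

$V(r)$ is to be so chosen as to make the elements of the matrix $H_1$ as small
as possible. This choice makes $H_0$ as good an approximation to $H$ as is
possible without altering its fundamental form.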

62c. The Perturbation Form of the General Atomic Problem. — As 
explained in Sec. 57b, the usual method of approach to the problem of the
many-electron atom is through a perturbation calculation in which the
unperturbed problem (problem A) is of the central-field type with an
$H_0$ of the form specified in Eq. (62·9) permitting solution by the method
of the separation of variables. In fact the whole theoretical significance
of the empirical analysis of the energy-level systems is tied up with this
method of approach. The energy levels of the unperturbed problem are
defined by a set of values of the Bohr quantum numbers $n$ and $l$, one pair
for each electron. All energy levels of the perturbed problem originating
in a common central-field energy level are said to belong to a common
electronic configuration. In specifying the individual electron states which
define the configuration it is customary to use the spectroscopic term symbols
$1s, 2s, 2p, \cdots$, introduced in Sec. 56a. If $f$ electrons have the same
pair of values of $n$ and $l$, the corresponding term symbol is provided
with the exponent $f$. Otherwise the symbol for a configuration is written
formally as the product of the term symbols for the individual electrons.
For example, the configuration which gives rise to the lowest diffuse-series
level for sodium is symbolized by $1s^2 2s^2 2p^6 3d$ to indicate that the
core contains two $1s$ electrons, two $2s$ electrons, and six $2p$ electrons,
while the valence electron is in a $3d$ state.
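The rule for writing a configuration symbol can be stated as a small routine. The sketch below is illustrative only: it takes one $(n, l)$ pair per electron, writes exponents with a caret, and assumes the usual spectroscopic letters $s, p, d, f, g, h, i, k$ for $l = 0, 1, 2, \cdots$.

```python
from collections import Counter

# Spectroscopic letters for l = 0, 1, 2, ...; the letter j is skipped.
L_LETTERS = "spdfghik"


def configuration_symbol(electrons):
    """Build a configuration symbol from a list of (n, l) pairs."""
    counts = Counter(electrons)
    parts = []
    for n, l in sorted(counts):
        term = f"{n}{L_LETTERS[l]}"
        if counts[(n, l)] > 1:
            # several electrons with the same n and l: attach the exponent
            term += f"^{counts[(n, l)]}"
        parts.append(term)
    return " ".join(parts)
```

Applied to the sodium example above, with two $(1,0)$, two $(2,0)$, six $(2,1)$ pairs and one $(3,2)$ pair, the routine returns `1s^2 2s^2 2p^6 3d`.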

The straightforward way to carry through a first-order calculation of
the perturbed (true) energy levels would be to form the matrix $H_1 + U_s$
with unperturbed wave functions and diagonalize separately the step of
this matrix belonging to each individual configuration. This process is
difficult, however, owing to the great difference between the symmetry of
the unperturbed problem and that of the actual problem. Hence it is
desirable, where possible, to make use of the fact that in a large and
particularly important class of cases the electrostatic interaction $H_1$ is
much more important than the spin-orbit interaction $U_s$.

In Sec. 58a we saw that the major energy levels of the lighter atoms,
and to a large extent of the heavier atoms as well, can be classified by
means of two quantum numbers $L$ and $S$. The first of these was provisionally
interpreted as the quantum number of the resultant orbital
angular momentum $\vec{\mathcal{L}}$ because it obeys the principle of selection $\Delta L = 0$,
$\pm 1$. The second determines the multiplicity of the system to which the
level belongs and in the lighter atoms is subject to the principle of
selection $\Delta S = 0$, which forbids intersystem combination lines. In
Sec. 58b we verified that the number of fine-structure components of the
various major levels agrees with the number to be expected if $S$ is interpreted
as an angular momentum quantum number like $L$ and the corresponding
angular momentum vector $\vec{\mathcal{S}}$ is weakly coupled to $\vec{\mathcal{L}}$. Finally,
in Sec. 58d we identified $\vec{\mathcal{S}}$ in the case of the alkalies with the spin angular
momentum of the valence electron. It will be evident that in the more
general case where there are many electrons outside the inert spherically
symmetric core, we can hardly fail to identify $\vec{\mathcal{S}}$ with the resultant spin
angular momentum of the valence electrons. In due course we shall
prove that the resultant spin angular momentum of the core is zero, so
that $\vec{\mathcal{S}}$ can equally well be identified with the resultant spin angular
momentum of the entire atom.

The above identifications imply that (a) $\mathcal{L}^2$ and $\mathcal{S}^2$ are approximate
integrals of the motion and that (b) they are (in the same approximation)
functions of the energy. We know from Secs. 40f and 58b that both of
these conditions are satisfied insofar as $\mathcal{L}^2$ is concerned if we neglect the
spin energy $U_s$ and spin coordinates. It is easy to see (cf. Sec. 63a)
that the introduction of spin coordinates without spin energy does not
affect this conclusion. If we neglect $H_1$ as well as $U_s$, $\mathcal{L}^2$ remains an
integral of the motion, but is no longer a function of the energy since
more than one eigenvalue of $\mathcal{L}^2$ is compatible with most eigenvalues of
$H_0$. Therefore the suggested interpretations of the quantum numbers
$L$ and $S$ imply that the approximation in which $H_1$ is taken into consideration,
but $U_s$ is neglected, is of primary importance. As a matter of
fact this approximation is sufficient in a large class of cases to locate the
major energy levels with reasonable precision. The fine structure can
then be calculated by a second perturbation calculation in which $U_s$ is
taken into consideration, and the square of the resultant $\vec{\mathcal{J}}$ of $\vec{\mathcal{L}}$ and $\vec{\mathcal{S}}$
becomes a function of the energy. We owe to Russell and Saunders¹
the first suggestion that the spectra of the alkaline earths can be most
easily interpreted on the assumption that in certain energy levels both
valence electrons contribute to a quantized resultant orbital angular
momentum. Hence it is customary to say that we have to do with
Russell-Saunders coupling whenever $U_s$ plays a minor role in comparison
with $H_1$. Because the last stage of an energy-level calculation for
Russell-Saunders coupling involves what is called "the union of $\vec{\mathcal{L}}$ and $\vec{\mathcal{S}}$
to form a quantized resultant $\vec{\mathcal{J}}$," Russell-Saunders coupling is also
called $\vec{\mathcal{L}},\vec{\mathcal{S}}$ coupling. In this book we make no attempt to discuss
problems in which the coupling of the electrons is of any other type.

For problems of the class under consideration the perturbation
calculation $H_0 \to H_0 + H_1 + U_s$ is conveniently subdivided into two
parts, viz., $H_0 \to H_0 + H_1$ and $H_0 + H_1 \to H_0 + H_1 + U_s$. Following
the nomenclature of Sec. 57b we designate the $H_0$ eigenvalue-eigenfunction
problem as A, the $H_0 + H_1$ problem as B, and the complete
$H_0 + H_1 + U_s$ problem as C. The intermediate Hamiltonian $H_0 + H_1$
will be referred to as $H_B$.

The division of the complete perturbation calculation $A \to C$ into
two parts, $A \to B$ and $B \to C$, is advantageous in providing us with a
theoretical interpretation of the empirical $L$ values and their selection
rule, and also as a labor-saving device. There is an economy of effort
because the hypothesis of Russell-Saunders coupling permits us in first
approximation to neglect the matrix elements of $U_s$ between different
energy levels of problem B, even though these levels have come from the
same initial configuration.

63. PROBLEM B: THE SPIN-ORBIT ENERGY NEGLECTED 

63a. Integrals of the Motion. —We have learned (Sec. 48d) that in a 
perturbation calculation it is usually advisable to choose for the initial 
unperturbed wave functions simultaneous eigenfunctions of the unper- 
turbed Hamiltonian and as many integrals of the perturbed motion as 
possible. The initial matrix of the perturbed Hamiltonian is then 
diagonal in the integrals and only one eigenvalue of each integral need 
be considered at a time. Moreover, any integrals which are functions 
of the perturbed energy are available for the classification of the per- 
turbed levels. 

We have already made a study of the integrals of the spin-free form
of problem B in Sec. 40 and have made use of the results at many points
¹ H. N. Russell and F. A. Saunders, Astrophys. J. 61, 38 (1925).





in the last two chapters. The integrals discussed in Sec. 40 belong to
two sets associated with the rotation-reflection group and the permutation
group, respectively. The addition of the spin coordinates to the positional
coordinates brings in the spin operators $\mathcal{S}_{kx}$, $\mathcal{S}_{ky}$, $\mathcal{S}_{kz}$, which are integrals of the
perturbed motion originating like $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ in the rotation-reflection
group of transformations. On account of the Pauli principle we are
interested only in those integrals which commute with the antisymmetrizing
operator $\mathcal{A}$ of Sec. 42b and constitute actual observables. This
fact rules out the operators for the individual spins and leaves only the
components of the resultant spin

$$\mathcal{S}_x = \sum_{k=1}^{f}\mathcal{S}_{kx}, \qquad \mathcal{S}_y = \sum_{k=1}^{f}\mathcal{S}_{ky}, \qquad \mathcal{S}_z = \sum_{k=1}^{f}\mathcal{S}_{kz}, \tag{63·1}$$

and functions of them. The simplest of these functions which commutes
with all three of the primary components is

$$\mathcal{S}^2 = \mathcal{S}_x^2 + \mathcal{S}_y^2 + \mathcal{S}_z^2. \tag{63·2}$$

The vector operator S defined by Kejs. (63*1) conforms to the usual 
commutation rules for angular moiiK'iita and is to be idc'iitified with the 

vector S of Sec. 58. We shall use the symbols aS and Ms for th(‘ quantum 
numbers of S- and 82 as in Eq. (58 2). Each eig(m value of cS^ must be the 
sum of eigenvalues of oSax over all values of /c, for the operators Sa* 
commute, and their matricevs can be made simultaneously diagonal. 
Consequently Ms is an integer, or an odd multiple of * 2 » ac'cording as the 
number of electrons attached to the nucleus is ('V(‘n or odd. By th(^ 
much cited argument of S(^c. 40/ it follows that S is also an integer, or an 
odd multiple of 3 ^, according as the number of (dt'ctrons in tlu' atomic 
system is even or odd. This conclusion giv(\s a th(*orc’tical ('xplanation 
of the empirical observation (p. 495) that enc'rgy k'vt'ls of even mul- 
tiplicity (even values of 2S + 1 ) occur in atoms with an odd number 
of extranuclear ele(;trons, w^hile those of odd multiplicity oc.cur in atoms 
with an even number of extranuclear electrons. 
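
The alternation of multiplicities with the number of electrons can be checked by a short enumeration. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual rule that coupling a spin s with one further spin ½ yields the resultants |s − ½| and s + ½.

```python
from fractions import Fraction

def resultant_spins(f):
    """Set of resultant spin quantum numbers S obtainable from f electrons."""
    half = Fraction(1, 2)
    values = {Fraction(0)}              # no electrons: S = 0
    for _ in range(f):
        coupled = set()
        for s in values:
            # adding one more spin 1/2 gives |s - 1/2| and s + 1/2
            coupled.add(abs(s - half))
            coupled.add(s + half)
        values = coupled
    return values

def multiplicities(f):
    """Sorted list of the multiplicities 2S + 1 possible for f electrons."""
    return sorted(int(2 * s + 1) for s in resultant_spins(f))

if __name__ == "__main__":
    for f in range(1, 7):
        print(f, multiplicities(f))
```

For odd f every multiplicity in the list is even, and for even f every multiplicity is odd, in agreement with the empirical rule quoted above.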

We saw in Sec. 40f, p. 314, that $\mathcal{L}^2$ and K commute with every member 
of the group of the Schrödinger equation for the spin-free form of problem 
B and are therefore implicit functions of the Hamiltonian in the discrete 
spectrum. Since the spin operators all commute with $\mathcal{L}^2$ and K this 
conclusion is not affected by the introduction of the spin coordinates. 
(In fact we have already assumed in Sec. 62c that $\mathcal{L}^2$ is a function of the 
energy in problem B.) It follows that every energy level of problem B is 
characterized by a definite value of the quantum number L and has 
2L + 1 mutually orthogonal eigenfunctions which are also eigenfunctions 



530 THEORY OF STRUCTURE OF COMPLEX ATOMS [Chap. XIV 

of $\mathcal{L}_z$ with different eigenvalues and which are obtainable one from the 
other by the application of the operators $\mathcal{L}_x \pm i\mathcal{L}_y$. 

$\mathcal{S}^2$ also commutes with every integral of problem B, nonsymmetric 
functions of the individual spin operators $\mathcal{S}_{kx}$, $\mathcal{S}_{ky}$, $\mathcal{S}_{kz}$ excepted. Since 
the nonsymmetric spin operators do not commute with the 
antisymmetrizing operator $\mathcal{G}$ of Sec. 42b, we can say that $\mathcal{S}^2$ commutes with 
every operator which commutes with $H_B$ and $\mathcal{G}$. Consequently $\mathcal{S}^2$ 
is a function of $H_B$ and $\mathcal{G}$. But by the Pauli principle every physically 
admissible wave function is an eigenfunction of $\mathcal{G}$ with the eigenvalue +1. 
This means that for physical purposes $\mathcal{S}^2$ may be regarded as a function 
of $H_B$; for, if an atom is in a state whose wave function is a physically 
admissible eigenfunction of $H_B$, it will be correlated with a definite value 
of S. 

This conclusion is at first sight very surprising, especially when we 
verify that a single configuration and a single value of L can yield two 
different and well-separated energy levels of problem B with different 
values of S. When it was first discovered experimentally that atomic 
energies had an apparently strong dependency on the associated values of 
S, it seemed necessary to assume a strong spin-spin interaction energy to 
account for the phenomenon. Our analysis indicates, however, that this 
strong dependency of energy on the resultant spin quantum number can 
exist without any spin-energy term in the Hamiltonian whatsoever. 
Actually the energy differences of states with different values of S, but 
otherwise apparently the same, are of electrostatic origin and simply 
reflect different types of symmetry in the wave functions. 
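
A toy calculation may make the electrostatic origin of this splitting concrete. The sketch below is not taken from the text: it assumes a hypothetical one-dimensional model with two particle-in-a-box orbitals and a softened Coulomb repulsion, and it uses the standard first-order result that the space-symmetric (singlet) and space-antisymmetric (triplet) combinations of two orbitals have repulsion energies J + K and J − K, where J and K are the direct and exchange integrals. No spin term enters the energy anywhere.

```python
import math

N = 60                       # grid points (assumed; coarse but sufficient here)
BOX = 1.0                    # box length in arbitrary units (assumed)
DX = BOX / N
XS = [(i + 0.5) * DX for i in range(N)]

def box_orbital(n, x):
    """Normalized particle-in-a-box orbital (hypothetical model orbital)."""
    return math.sqrt(2.0 / BOX) * math.sin(n * math.pi * x / BOX)

A = [box_orbital(1, x) for x in XS]
B = [box_orbital(2, x) for x in XS]

def repulsion(x1, x2, soft=0.1):
    """Softened Coulomb repulsion; the softening length is an assumption."""
    return 1.0 / math.sqrt((x1 - x2) ** 2 + soft ** 2)

def direct_and_exchange():
    """Midpoint-rule estimates of the direct (J) and exchange (K) integrals."""
    J = K = 0.0
    for i, x1 in enumerate(XS):
        for j, x2 in enumerate(XS):
            v = repulsion(x1, x2) * DX * DX
            J += A[i] ** 2 * B[j] ** 2 * v
            K += A[i] * B[i] * A[j] * B[j] * v
    return J, K

if __name__ == "__main__":
    J, K = direct_and_exchange()
    print("singlet (S = 0) repulsion energy:", J + K)
    print("triplet (S = 1) repulsion energy:", J - K)
```

The two energies differ by 2K even though the model Hamiltonian contains only the electrostatic repulsion; the value of S merely labels the symmetry of the space part of the wave function.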

Because of the 2S + 1 values of $M_S$ compatible with the quantum 
number S, every energy level associated with this quantum number has 
2S + 1 mutually orthogonal wave functions derivable one from the other 
by application of the operators $\mathcal{S}_x \pm i\mathcal{S}_y$. Since these operators commute 
with the components of $\vec{\mathcal{L}}$, the total number of mutually orthogonal wave 
functions of an energy level with the orbital- and spin-angular-momentum 
quantum numbers L and S is at least (2L + 1)(2S + 1), or the product of the 
number of $M_L$ values compatible with L into the number of $M_S$ values 
compatible with S. 

The energy levels of problem B, hitherto called "major energy levels," 
are sometimes called "terms" to distinguish them from the energy levels 
of the central-field problem A (electronic configurations) and from the 
final fine-structure components of problem C.¹ They are classified by 
the corresponding eigenvalues of $\mathcal{L}^2$, $\mathcal{S}^2$, and K. The letter symbols 
indicating different values of the quantum number L are indicated in 
Table I, Sec. 58a. The eigenvalues of $\mathcal{S}^2$ are indicated by an appropriate 
numerical superscript (the multiplicity 2S + 1 of the corresponding 
system) attached to the letter symbol on the left. Thus ³S and ²P 

¹ Cf. Condon and Shortley, T.A.S., p. 189. 





denote, respectively, a triplet term (S = 1) with L equal to zero and a 
doublet term (S = ½) with L equal to unity. In order to distinguish 
between the even and odd states correlated with the eigenvalues +1 
and −1 of the operator K, a superscript 0 may be attached on the right 
to the term symbols of odd states. Thus D⁰ denotes an odd state for 
which L = 2 and D an even state with the same value of L. 
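
The labeling convention just described is easy to mechanize. The following sketch is illustrative only: the function names are hypothetical, and the letter sequence S, P, D, F, G, H, I, K is the standard spectroscopic one (with J omitted), which Table I is assumed to follow.

```python
from fractions import Fraction

L_LETTERS = "SPDFGHIK"      # letters for L = 0, 1, 2, ... (standard convention)

def term_symbol(L, S, odd=False):
    """Multiplicity 2S + 1 followed by the letter for L; '0' marks an odd term."""
    multiplicity = int(2 * Fraction(S) + 1)
    return f"{multiplicity}{L_LETTERS[L]}" + ("0" if odd else "")

def statistical_weight(L, S):
    """Number of M_L values times number of M_S values: (2L + 1)(2S + 1)."""
    return (2 * L + 1) * int(2 * Fraction(S) + 1)

if __name__ == "__main__":
    print(term_symbol(0, 1), statistical_weight(0, 1))                              # triplet S
    print(term_symbol(1, Fraction(1, 2)), statistical_weight(1, Fraction(1, 2)))    # doublet P
    print(term_symbol(2, 0, odd=True))                                              # odd singlet D
```

In printed work the multiplicity is a left superscript and the 0 a right superscript; the flat strings here are only a plain-text stand-in.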

As a matter of fact, K is a function of the Hamiltonian $H_0$, so that all 
energy levels coming from any given configuration are either even, or 
odd, as the case may be. To see which configurations give even levels, 
and which odd, we apply the operator K first to the individual solutions 
of the central-field problem, obtained by separating the variables in 
spherical coordinates r, θ, φ. When applied to functions of r, θ, φ, the 
operator K replaces θ by π − θ and φ by π + φ. The Legendre 
polynomials $P_l(\cos\theta)$ (cf. Sec. 27b) are even or odd functions of cos θ according 
as l is even or odd. Hence the associated Legendre functions $P_{l,|m|}(\cos\theta)$ 
are even or odd functions of cos θ according as l − |m| is even or odd. 
Thus K leaves $P_{l,|m|}(\cos\theta)$ unchanged, or reverses its sign, according 
as l − |m| is even or odd. But $e^{im(\pi+\varphi)} = e^{im\varphi}$ if m is even and 
$e^{im(\pi+\varphi)} = -e^{im\varphi}$ if m is odd. Thus each individual solution is even or 
odd with respect to K according as l is even or odd, independent of m. 

Let us next consider a product-form eigenfunction of the many-electron 
atomic problem in the limiting case A form. Evidently the 
effect of K is to introduce a factor $(-1)^{l_k}$ for each electron. Hence a 
configuration is even or odd according as $\sum_{k=1}^{f} l_k$ is even or odd. 
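
The parity rule for configurations reduces to a one-line computation. The sketch below is a minimal illustration; the function name and the letter-to-l map are hypothetical conveniences, not notation from the text.

```python
ORBITAL_L = {"s": 0, "p": 1, "d": 2, "f": 3}   # usual orbital letters

def configuration_parity(orbital_letters):
    """Return +1 (even) or -1 (odd): the product of (-1)**l over all electrons."""
    total_l = sum(ORBITAL_L[letter] for letter in orbital_letters)
    return 1 if total_l % 2 == 0 else -1

if __name__ == "__main__":
    print(configuration_parity("pp"))      # two p electrons: even
    print(configuration_parity("sssspp" + "p"))  # 1s2 2s2 2p3: odd
    print(configuration_parity("sp"))      # one s and one p electron: odd
```

Closed shells always contain an even number of electrons with the same l and so never change the parity; only the electrons outside closed shells matter.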

In considering the problem of degeneracy for a Hamiltonian operator 
H in the light of the Pauli exclusion principle it is necessary to distinguish 
between two kinds of degeneracy. The mathematical degeneracy of an 
energy level is the total number of linearly independent eigenfunctions 
of $E_k$, regardless of sex, race, or previous condition of servitude. The 
discussion in Secs. 40e and f refers to this kind of degeneracy. On the 
other hand we define the physical degeneracy of $E_k$ as the number of 
linearly independent antisymmetric eigenfunctions associated with $E_k$, 
provided, of course, that the system under consideration contains at 
least two electrons. Physical and mathematical degeneracy are to be 
identified for systems that contain no pairs of identical particles. The 
physical degeneracy gives the statistical weight $g_k$ introduced in Sec. 54a 
for the calculation of population densities, since only antisymmetric 
wave functions can support a population. 

In connection with our study of the Pauli principle in Sec. 42b we 
defined a complete set of normally commuting independent observables 





as a set with the property that any pair of antisymmetric simultaneous 
eigenfunctions of the set with the same eigenvalues must be linearly 
dependent. In other words, the sets of mutually compatible eigenvalues 
of such a set of observables are all to be physically nondegenerate. 
The complete set of commuting observables bears the same relation to 
the problem of physical degeneracy as the complete set of commuting 
dynamical variables bears to the problem of mathematical degeneracy. 
If the Hamiltonian H is united with observables $\alpha_1, \alpha_2, \cdots$ to form 
a complete set of commuting observables, the physical degeneracy of 
any energy level $E_k$ is equal to the number of different sets of α 
eigenvalues compatible with $E_k$. 

In view of the spin degeneracy it is obvious that the set of independent 
commuting observables formed by $H_B$ and $\mathcal{L}_z$ is not complete. We shall 
prove that it becomes complete when we add $\mathcal{S}_z$ to it. It is to be observed 
that the permutation group contributes nothing to the physical 
degeneracy (cf. Sec. 42b, p. 338). Thus the factor N in the expression N(2L + 1) 
given on p. 317 for the number of linearly independent simultaneous 
wave functions belonging to E and L, becomes unity if we count 
only functions which satisfy the Pauli principle. On the other hand, 
the inclusion of the spin coordinates in the wave functions enlarges 
the rotation-reflection group of operators. It is possible to rotate the 
reference axes for the spin coordinates and for the space coordinates 
separately. Hence the physical degeneracy of an energy level $E_k$ with 
the orbital angular-momentum quantum number L and the spin quantum 
number S is equal to the product of the number of wave functions 
derivable from any one by rotations of the axes for the space coordinates 
and the number derivable by rotations of the axes for the spin coordinates. 

In Sec. 40c we saw that every rotation of the space-coordinate axes 
can be effected by an operator of the form 

$$F(\omega) = e^{\frac{i\omega}{\hbar}(\lambda\mathcal{L}_x + \mu\mathcal{L}_y + \nu\mathcal{L}_z)} \qquad (63\cdot3)$$

where λ, μ, ν are direction cosines. Such an operator can commute with 
$\mathcal{L}_z$ only if λ and μ are zero, in which case F becomes a function of $\mathcal{L}_z$. 

Inspection shows that the operator T(ω) which transforms a spin 
function from one set of reference axes to another obtained from it by 
rotation through an angle ω about the z axis is of the form 

$$T(\omega) = e^{\frac{i\omega}{\hbar}\mathcal{S}_z} \qquad (63\cdot4)$$

(cf. Sec. 61b). The corresponding operator for a general rotation about an 
axis whose direction cosines are λ, μ, ν is accordingly 

$$T(\omega) = e^{\frac{i\omega}{\hbar}(\lambda\mathcal{S}_x + \mu\mathcal{S}_y + \nu\mathcal{S}_z)} \qquad (63\cdot5)$$




Evidently the only operators of this type which can commute with $\mathcal{S}_z$ are 
functions of $\mathcal{S}_z$. 

To complete our discussion of the rotation-reflection group for the 
spin variables it is natural to attempt to set up a spin operator analogous 
to the reflection operator K. An examination of the problem shows, 
however, that it is impossible to construct a spin operator whose 
properties offer a satisfactory parallel to those of K. Even if it could be done 
we should expect the new operator to commute with all the rotations 
and hence to be a function of $\mathcal{S}^2$ of little importance. 

It is now evident that there is no pair of observables in the group of 
the Schrödinger equation which commute with $H_B$, $\mathcal{L}_z$, $\mathcal{S}_z$, but not with 
each other. In accordance with the fundamental postulate of Sec. 40c, 
p. 313, we conclude that these three operators form (within the discrete 
spectrum of $H_B$) a complete set of commuting independent observables 
and that the statistical weight of an energy level with the orbital- and 
spin-angular-momentum quantum numbers L and S is exactly 

(2L + 1)(2S + 1) 

as indicated on p. 530. 

63b. Antisymmetric Functions and the Empirical Pauli Exclusion 
Rule. — The simplest class of eigenfunctions of the Hamiltonian $H_0$ 
of problem A are the products of individual electron functions similar 
to those introduced in Eq. (57·6), except that they contain the spin 
coordinates. Each of the individual electron functions u has for its 
arguments the coordinates x, y, z, σ of some particular electron, say 
the kth, and is a solution of the corresponding equation 

$$\nabla_k^2 u + \kappa[\varepsilon - V(r_k)]u = 0 \qquad (63\cdot6)$$

[cf. Eq. (57·5)]. Solutions of this equation obtained by the usual 
separation of variables are characterized by a set of four individual electron 
quantum numbers $n, l, m_l, m_s$. Let $a_j$ denote the jth of these 
individual sets and let $(k|a_j)$ denote the corresponding solution of (63·6). 
It is the product of a function of the form defined in Eq. (57·7) 
and a spin factor. Denoting a set of f sets of four individual 
electron quantum numbers by α, we introduce 

$$\phi_\alpha = (1|a_1)(2|a_2)\cdots(f|a_f) = \prod_{k=1}^{f}(k|a_k) \qquad (63\cdot7)$$

as the typical product-form eigenfunction of $H_0$ with the eigenvalue 
$E_0(\alpha) = \sum_{k=1}^{f} \varepsilon_k$. 

The energy $E_0(\alpha)$ is degenerate on account of the multiplicity of $m_l$ 
and $m_s$ values which can be assigned to any particular pair of values of n 





and l. There is also degeneracy due to the equivalence of the different 
functions obtained by applying the various permutation operators to $\phi_\alpha$. 
The latter degeneracy is eliminated, however, by the Pauli principle 
which admits at most one function obtained from $\phi_\alpha$ by permutation 
and linear combination, viz., the function $\mathcal{G}\phi_\alpha$ obtained from $\phi_\alpha$ by 
application of the operator $\mathcal{G}$. 

$\mathcal{G}\phi_\alpha$ will vanish if, and only if, the full set of 4f quantum numbers α 
includes two or more identical individual sets of four. To prove this 
statement we note that if two sets of quantum numbers $a_j$ and $a_{j'}$ are equal, 
each term of the sum 

$$\mathcal{G}\phi_\alpha = \frac{1}{f!}\sum_{\tau=1}^{f!} (-1)^{p_\tau} P_\tau \phi_\alpha \qquad (63\cdot8)$$

(in which $p_\tau$ is even or odd with the permutation $P_\tau$) 
can be paired with another term which differs from it only in the 
interchange of the coordinates of the two electrons which in these terms 
have the same set of quantum numbers. The permutations which give 
these terms differ by a simple interchange. Hence one is odd and the 
other even. It follows that the terms are equal in magnitude and 
opposite in sign. Hence all terms cancel in pairs when two or more 
individual sets of quantum numbers are equal. If no two sets are equal, 
the permutations of $\phi_\alpha$ are linearly independent, and the linear 
combination $\mathcal{G}\phi_\alpha$ cannot vanish. 

We conclude that in the central-field approximation no states are 
allowed by the wave-mechanical form of the Pauli principle developed in 
Sec. 42b in which two electrons have the same set of four quantum numbers 
$n, l, m_l, m_s$. Since $m_s$ has just two values, this means that not more than 
two electrons can have the same set of three space-coordinate quantum 
numbers $n, l, m_l$. Thus the wave-mechanical form of the Pauli exclusion 
principle is actually equivalent to the more empirical exclusion rule given 
in Sec. 57b. 

Because $\mathcal{G}$ commutes with $H_0$ and $H_1$ we know from Sec. 48d that in 
solving the perturbation problem A → B we can deal separately with 
each eigenvalue of $\mathcal{G}$. As λ approaches zero each antisymmetric 
eigenfunction of $H_0 + \lambda H_1$ must approach an antisymmetric eigenfunction 
of $H_0$. The interpolation problem will establish a connection between 
each antisymmetric eigenfunction of problem B and a corresponding 
antisymmetric eigenfunction of problem A. Therefore the exclusion of 
problem A wave functions which are not antisymmetric implies the 
exclusion of the problem B wave functions associated with them. 

The reader will recall that in the discussion of the equivalence of 
particles of the same species in Sec. 42b we were able to prove theoretically 
that physically admissible wave functions must be either symmetric or 




antisymmetric with respect to all interchange permutations involving 
a single kind of particle. In order to complete the formulation of the 
Pauli principle it was necessary, however, to make the arbitrary 
assumption that the symmetric possibility is to be ruled out for electrons and 
protons, physically admissible ψ functions being definitely antisymmetric 
with respect to interchanges of these two species. We are now in 
position to justify this assumption insofar as it applies to electrons. 

Symmetric eigenfunctions of $H_0$ exist with any number of identical 
sets of individual quantum numbers $n, l, m_l, m_s$. Consequently the use of 
symmetric wave functions would not involve an exclusion rule of the type 
demanded by the empirical facts cited in Sec. 57b. Therefore the above 
mentioned assumption is required by experiment. The corresponding 
proof that physically admissible wave functions must also be 
antisymmetric with respect to proton interchanges is obtainable in essentially the 
same way, viz., by a comparison of the observed energy levels of a 
multiproton problem (that of the H₂ molecule) with those to be expected 
from the symmetric and antisymmetric alternatives allowed by the 
general theory.¹ 

Although the application of the operator $\mathcal{G}$ to the product-form 
eigenfunction of problem A given in (63·7) generates an antisymmetric 
eigenfunction of the same problem, it does not take care of the normalization 
of the new function. Let us assume that $\phi_\alpha$ is normalized to unity by the 
normalization of the factors $(k|a_k)$. The scalar product of any pair of 
terms in $\mathcal{G}\phi_\alpha$ resolves itself into a product of individual scalar products 
such as 

$$\big((k|a_j),\,(k|a_{j'})\big).$$

If $\mathcal{G}\phi_\alpha$ does not vanish identically, at least two of the factors in any 
term will be orthogonal to the corresponding factors of any other specific 
term. Hence every term is orthogonal to every other, and 

$$(\mathcal{G}\phi_\alpha,\ \mathcal{G}\phi_\alpha) = \frac{1}{f!}. \qquad (63\cdot9)$$

To obtain normalized functions which satisfy the Pauli principle we 
multiply $\mathcal{G}\phi_\alpha$ by $\sqrt{f!}$ and call the result $\mathcal{A}\phi_\alpha$ or $\Psi_\alpha$. The operator 
$\mathcal{A} = \sqrt{f!}\,\mathcal{G}$ has been called the antisymmetrizer by Condon and Shortley.² 

It was first pointed out by Slater³ that the function $\Psi_\alpha$ can be written 
as a determinant built up from the functions $(k|a_j)$. Thus 

¹ Cf. D. M. Dennison, Proc. Roy. Soc. A115, 483 (1927). 

² E. U. Condon and G. H. Shortley, T.A.S., p. 165. 

³ J. C. Slater, Phys. Rev. 34, 1293 (1929). 







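$$\Psi_\alpha = \frac{1}{\sqrt{f!}}
\begin{vmatrix}
(1|a_1) & (2|a_1) & \cdots & (f|a_1) \\
(1|a_2) & (2|a_2) & \cdots & (f|a_2) \\
\vdots & \vdots & & \vdots \\
(1|a_f) & (2|a_f) & \cdots & (f|a_f)
\end{vmatrix} \qquad (63\cdot10)$$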


The equivalence of (63·10) and (63·8) follows directly from the definition 
of a determinant. Wave functions of this form are commonly referred 
to as Slater wave functions. When we express the wave function as a 
determinant, its antisymmetry with respect to electron interchanges follows 
from the elementary rule that a determinant changes sign whenever any two 
columns are interchanged. 
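
The properties claimed for the antisymmetrized product can be verified numerically on a small case. The sketch below is an illustration under simplifying assumptions: the one-electron functions are arbitrary real functions of a single coordinate (spin is ignored), and all names are hypothetical. It evaluates the normalized sum over permutations directly, which is the same thing as expanding the determinant.

```python
import math
from itertools import permutations

def parity_sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def slater_value(orbitals, coords):
    """(1/sqrt(f!)) * sum over permutations of sign * u_1(..) u_2(..) ... u_f(..)."""
    f = len(orbitals)
    total = 0.0
    for perm in permutations(range(f)):
        term = 1.0
        for j, k in enumerate(perm):
            term *= orbitals[j](coords[k])   # j-th function holds electron perm[j]
        total += parity_sign(perm) * term
    return total / math.sqrt(math.factorial(f))

if __name__ == "__main__":
    u = [lambda x: 1.0, lambda x: x, lambda x: x * x]   # toy one-electron functions
    print(slater_value(u, (0.3, 1.1, 2.0)))
    print(slater_value(u, (1.1, 0.3, 2.0)))             # electrons 1 and 2 interchanged
    print(slater_value([u[1], u[1], u[2]], (0.3, 1.1, 2.0)))  # two identical sets
```

Interchanging two electrons reverses the sign, and repeating a one-electron function makes the result vanish to rounding error, which is the exclusion rule in miniature.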

63c. Closed Shells. — A group of electrons to which the same values of 
n and l are assigned in the case A approximation is said to constitute a 
"shell."¹ The electrons in any given shell are said to be equivalent. 
By the Pauli exclusion rule the maximum number of electrons which can 
coexist in such a shell is equal to the number of different pairs of values 
of $m_l$ and $m_s$ which are compatible with l. This number is 2(2l + 1). 
A shell containing the maximum allowed number of electrons is said to 
be closed. 
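
The count 2(2l + 1) follows from listing the allowed pairs. The short sketch below (with hypothetical function names) enumerates them explicitly.

```python
from fractions import Fraction

def shell_states(l):
    """All (m_l, m_s) pairs compatible with the orbital quantum number l."""
    half = Fraction(1, 2)
    return [(ml, ms) for ml in range(l, -l - 1, -1) for ms in (half, -half)]

def shell_capacity(l):
    """Maximum number of equivalent electrons allowed by the exclusion rule."""
    return len(shell_states(l))

if __name__ == "__main__":
    for l, letter in enumerate("spdf"):
        print(letter, shell_capacity(l))
```

For a closed shell every pair in the list is occupied, so the sums of the $m_l$ and of the $m_s$ values over the shell are both zero.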

The complete set of individual sets of quantum numbers required for a 
closed shell is uniquely defined. Hence the corresponding wave function 
Ψ is uniquely defined except for the usual arbitrary phase factor. The 
quantities $M_L = \sum_{k=1}^{f} m_{lk}$ and $M_S = \sum_{k=1}^{f} m_{sk}$ are both zero for every term 
of Ψ. It follows that Ψ is a simultaneous eigenfunction of $\mathcal{L}_z$, $\mathcal{S}_z$ 
with the eigenvalue zero for each. It can also be proved to be an 
eigenfunction of $\mathcal{L}^2$, $\mathcal{S}^2$ with the common eigenvalue zero. To establish 
this proposition we recollect that according to Sec. 63a $\mathcal{L}^2$, $\mathcal{S}^2$, 
$\mathcal{G}$ are integrals of the motion for problem B, and a fortiori for problem A. 
Since they commute, it is possible to make their matrices and that of $H_0$ 
simultaneously diagonal. Hence it must be possible to construct 
simultaneous eigenfunctions of all of these operators from Ψ and other 
physically admissible wave functions for the same shell. But this 
shell does not have two linearly independent physically admissible 
wave functions. Hence Ψ must be from the beginning an eigenfunction 
of $\mathcal{L}^2$ and $\mathcal{S}^2$. The only eigenvalue of either of these observables 
compatible with the lack of degeneracy is zero. But the eigenvalue of $\mathcal{L}^2$ 

¹ The notation goes back to the static atomic models of G. N. Lewis and Irving 
Langmuir in which the electrons were thought of as arranged in successive symmetric 
groups, or shells, surrounding the nucleus. 





for any given eigenfunction is the sum of the mean values of $\mathcal{L}_x^2$, $\mathcal{L}_y^2$, $\mathcal{L}_z^2$ 
for the state in question. These mean values are all positive or zero. 
It follows that Ψ is a simultaneous eigenfunction of $\mathcal{L}_x$, $\mathcal{L}_y$, $\mathcal{L}_z$ with the 
eigenvalue zero for each. In the same way we see that each of the 
components of $\vec{\mathcal{S}}$ yields zero when applied to Ψ. Finally, it follows that 
each component of $(\vec{\mathcal{L}} + \vec{\mathcal{S}})\Psi$ reduces to zero identically and that Ψ is an 
eigenfunction of $(\vec{\mathcal{L}} + \vec{\mathcal{S}})^2$ with the eigenvalue zero. 

Since the resultant orbital, spin, and total angular momenta of a 
closed shell are all zero, we are led to expect such a shell to be spherically 
symmetric. This expectation is verified by the observation that the 
operators F and T of Eqs. (63·3) and (63·4) which rotate the reference 
axes for the space and spin coordinates, respectively, through arbitrary 
angles, both reduce to the identical operator when applied to the wave 
function of a closed shell. 

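In passing we note that a rotation of common reference axes for space 
and spin coordinates, such as we ordinarily employ, is effected by the 
operator 

$$e^{\frac{i\omega}{\hbar}[\lambda(\mathcal{L}_x + \mathcal{S}_x) + \mu(\mathcal{L}_y + \mathcal{S}_y) + \nu(\mathcal{L}_z + \mathcal{S}_z)]}.$$

Hence it is clear that any eigenfunction of $(\vec{\mathcal{L}} + \vec{\mathcal{S}})^2$ 
with the eigenvalue zero is spherically symmetric. Similarly a 
spin-free eigenfunction of $\mathcal{L}^2$ with the eigenvalue zero is spherically symmetric. 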

63d. Terms Originating in a Given Configuration. — The perturbation 
problem A → B traces the coupling of the individual orbital angular 
momenta into a resultant $\vec{\mathcal{L}}$, and that of the individual spin angular 
momenta into a resultant $\vec{\mathcal{S}}$. As a consequence of this coupling the 
configurations of problem A are resolved into groups of terms, each 
partially, or wholly, characterized by the appropriate values of the 
resultant orbital and spin quantum numbers L and S. We have now to 
consider just which term types, and how many of each, originate in any 
given configuration. 

Let us first deal with the coupling of two sets of electrons with the 
resultant angular-momentum quantum numbers $L_1, S_1$ and $L_2, S_2$, 
respectively. The problem is analogous to that of Sec. 58b but differs 
from it on account of the inclusion of spin coordinates and the spin 
angular momenta. We suppose that the electrons of each set are initially 
coupled together, so that each energy level of each set has definite 
corresponding values of L and S. The electrons of each set when 
uncoupled will pass into problem A configurations characterized by a 
corresponding number of pairs of n, l values. We choose for our 
consideration the coupling of states of the two sets of such a character that 
no pair of n, l values which enters into the description of the configuration 
of one set is repeated in the description of the configuration of the other. 





In other words the two sets are to include no two equivalent electrons. 
By this restriction we avoid the exclusion of any set of values of $M_{L1}$, 
$M_{S1}$, $M_{L2}$, $M_{S2}$ on account of the Pauli principle. 

The argument of Sec. 58b can now be used to demonstrate that for any 
given values of $M_{S1}$ and $M_{S2}$ the construction of eigenfunctions of $\mathcal{L}^2$ 
from the various possible wave functions of the uncoupled system will 
yield values of L ranging from $L_1 + L_2$ down to $|L_1 - L_2|$ with no 
duplications. For each of those values of L there will be a corresponding L 
complex of simultaneous eigenfunctions of $\mathcal{L}^2$ and $\mathcal{L}_z$. In the same way 
linear combinations of wave functions with fixed values of L and $M_L$ 
$(= M_{L1} + M_{L2})$ but various values of $M_{S1}$ and $M_{S2}$ yield simultaneous 
eigenfunctions of $\mathcal{S}^2$ and $\mathcal{S}_z$ in which S ranges from $S_1 + S_2$ down to 
$|S_1 - S_2|$, while $M_S$ ranges from −S to +S. In other words the L values 
obtained in this way are the same as those derived in Sec. 58b; the 
S values are derived by the same rule; and all combinations of L and S 
values occur. Suppose, for example, that $L_1 = L_2 = 1$, while $S_1 = ½$, 
and $S_2 = 1$. The coupling of the two sets of electrons yields the term 
types ²S, ²P, ²D, ⁴S, ⁴P, ⁴D. 
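
The rule for coupling two nonequivalent groups can be written out in a few lines. The sketch below is illustrative; the function names are hypothetical, and terms are printed as flat strings such as 2P for a doublet P term.

```python
from fractions import Fraction

LETTERS = "SPDFGHIK"     # standard letters for L = 0, 1, 2, ...

def couple(j1, j2):
    """Resultants from |j1 - j2| up to j1 + j2 in unit steps."""
    j1, j2 = Fraction(j1), Fraction(j2)
    lowest = abs(j1 - j2)
    return [lowest + n for n in range(int(j1 + j2 - lowest) + 1)]

def coupled_terms(L1, S1, L2, S2):
    """All term types from two nonequivalent groups: every L with every S."""
    return [
        f"{int(2 * S + 1)}{LETTERS[int(L)]}"
        for S in couple(S1, S2)
        for L in couple(L1, L2)
    ]

if __name__ == "__main__":
    print(coupled_terms(1, Fraction(1, 2), 1, 1))
```

With $L_1 = L_2 = 1$, $S_1 = ½$, $S_2 = 1$ the list contains the doublet and quartet S, P, D terms of the example above.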

An immediate corollary of the above analysis is that the coupling of 
a closed shell with a second group of electrons which belong to another 
shell or shells does not affect the angular momentum of the latter group 
or the degeneracy of the energy levels. The ordinary optical 
atomic-energy levels originate in configurations in which all but a few valence 
electrons are in closed shells. Hence it is unnecessary to consider any 
but the valence electrons in determining the number of terms of various 
types which originate in any given configuration. 

The electrons outside the closed inner shells may be concentrated in 
a single outer shell, or distributed over several such shells. If we first 
determine the L and S values which originate in the coupling of the 
electrons in each of the outer shells, the L and S values for the entire 
atom can be worked out by a succession of imaginary coupling processes 
in each of which two groups of nonequivalent electrons are united, i.e., 
by a repetition of the process we have just described. When only 
two groups of electrons are coupled, a given pair of initial states yields 
not more than one term of a given type, but this is not true when more 
than two shells contribute to the angular momentum. The way in 
which this happens and the method used for labeling the terms in such 
a case are best explained by an example. 

Consider three nonequivalent p electrons, say 3p, 4p, 5p. The 
coupling of the 3p and 4p electrons gives rise to terms in which L ranges 
from 0 to 2 and S takes on the values 0 and 1. The corresponding term 
types are ¹S, ¹P, ¹D, ³S, ³P, ³D. In coupling the 5p electron to the other 
pair we have to consider the union of each of the above intermediate 
terms with the new electron. We obtain the following results: 




3p4p ¹S + 5p     ²P 

3p4p ¹P + 5p     ²S, ²P, ²D 

3p4p ¹D + 5p     ²P, ²D, ²F 

3p4p ³S + 5p     ²P, ⁴P 

3p4p ³P + 5p     ²S, ²P, ²D, ⁴S, ⁴P, ⁴D 

3p4p ³D + 5p     ²P, ²D, ²F, ⁴P, ⁴D, ⁴F 

In this case we have as many as six different terms of the same type 
as regards L and S values originating in a single configuration of three 
electrons. In order to distinguish between the different terms of the 
same type it is necessary to specify the parent term due to the 
preliminary coupling of the 3p and 4p electrons from which the given term 
issues. Thus the four different ²D terms are indicated by the symbols 
3p4p (¹P) 5p ²D; 3p4p (¹D) 5p ²D; 3p4p (³P) 5p ²D; 3p4p (³D) 5p ²D. 
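
The bookkeeping of parent terms in this example can be reproduced by coupling the electrons one at a time and remembering the intermediate term. The sketch below is an illustration with hypothetical names; it treats each added electron as nonequivalent to those already present, so that no states are excluded by the Pauli principle.

```python
from collections import Counter
from fractions import Fraction

LETTERS = "SPDFGHIK"
HALF = Fraction(1, 2)

def ladder(j1, j2):
    """Resultants from |j1 - j2| up to j1 + j2 in unit steps."""
    lowest = abs(j1 - j2)
    return [lowest + n for n in range(int(j1 + j2 - lowest) + 1)]

def label(L, S):
    return f"{int(2 * S + 1)}{LETTERS[int(L)]}"

def add_electron(terms, l):
    """Couple one nonequivalent electron (orbital l, spin 1/2) to each term.

    Each entry is (L, S, parent); the new parent is the label of the old term.
    """
    out = []
    for L, S, _ in terms:
        for S_new in ladder(S, HALF):
            for L_new in ladder(Fraction(L), Fraction(l)):
                out.append((L_new, S_new, label(L, S)))
    return out

single = [(Fraction(1), HALF, None)]     # the 3p electron alone
pair = add_electron(single, 1)           # 3p4p intermediate terms
triple = add_electron(pair, 1)           # 3p4p5p final terms with their parents

counts = Counter(label(L, S) for L, S, _ in triple)
parents_of_2D = sorted(parent for L, S, parent in triple if label(L, S) == "2D")

if __name__ == "__main__":
    print(dict(counts))
    print(parents_of_2D)
```

The tally shows six ²P terms and four ²D terms, the latter with the parents ¹P, ¹D, ³P, ³D listed above.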

The problem of the coupling of equivalent electrons remains to be 
considered. In order to find the terms generated it is necessary to write 
out in detail the possible sets of $m_l$ and $m_s$ values consistent with the 
given configuration and, after eliminating those which violate the Pauli 
principle, to determine by the counting process of Sec. 58b the character 
of the possible L and S complexes to be obtained by linear combination 
of the wave functions associated with the remaining sets. 

The procedure is again conveniently indicated by an example. We 
choose the simple case of two equivalent p electrons, a p² configuration. 
Each wave function is then specified by a corresponding set of values of 
$m_{l1}$, $m_{s1}$, $m_{l2}$, $m_{s2}$. We adopt the notation of Condon and Shortley,¹ 
denoting the eigenvalues of the spin coordinates by plus and minus signs 
attached as superscripts to the corresponding numerical value of $m_l$. 
Thus (1⁺ -1⁻) denotes a state, or wave function, derived by application 
of the antisymmetrizer $\mathcal{A}$ to a product function $u_1(1)u_2(2)$ in which 
$m_{l1} = 1$, $m_{s1} = ½$, $m_{l2} = -1$, $m_{s2} = -½$. Eliminating all such states 
in which the two pairs of quantum numbers are the same, and arranging 
the resulting state symbols in rows and columns according to the resulting 
values of $M_L = m_{l1} + m_{l2}$ and $M_S = m_{s1} + m_{s2}$, we obtain Table III 
shown on p. 540. The maximum value of $M_L$ is 2 and the only 
associated value of $M_S$ is zero. This indicates a ¹D state. The maximum 
value of $M_S$ is unity and the accompanying values of $M_L$ are +1, 0, −1. 
These indicate a ³P state. Striking out one symbol from each square 
of the center column for the ¹D state and one symbol from each square 
of each of the three central rows for the ³P state, there remains only a 
single symbol in the center of the diagram for a ¹S state. Hence the 
configuration yields the L and S complexes for ¹D, ³P, and ¹S terms 
with no wave functions left over. 

¹ Condon and Shortley, T.A.S., p. 169. 
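
The counting procedure for equivalent electrons lends itself to direct enumeration. The sketch below is an illustration with hypothetical names: it builds the table of allowed states by $(M_L, M_S)$ and then strikes out complete L, S complexes, starting each time from the largest remaining $M_L$ and, for that $M_L$, the largest remaining $M_S$.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

LETTERS = "SPDFGHIK"
HALF = Fraction(1, 2)

def microstate_table(l, n):
    """Number of Pauli-allowed states of n equivalent electrons per (M_L, M_S)."""
    one_electron = [(ml, ms) for ml in range(-l, l + 1) for ms in (HALF, -HALF)]
    table = Counter()
    # combinations() never repeats a pair, which enforces the exclusion rule
    for chosen in combinations(one_electron, n):
        M_L = sum(ml for ml, _ in chosen)
        M_S = sum(ms for _, ms in chosen)
        table[(M_L, M_S)] += 1
    return table

def terms_from_table(table):
    """Strike out one state per square for each term until nothing is left."""
    table = +Counter(table)
    terms = []
    while table:
        L, S = max(table)                 # largest M_L, then largest M_S
        terms.append(f"{int(2 * S + 1)}{LETTERS[L]}")
        for M_L in range(-L, L + 1):
            for k in range(int(2 * S) + 1):
                table[(M_L, -S + k)] -= 1
        table = +table                    # discard squares that are now empty
    return terms

if __name__ == "__main__":
    p2 = microstate_table(1, 2)
    print(sum(p2.values()), "states")
    print(terms_from_table(p2))
```

For two equivalent p electrons the table holds fifteen states and the striking-out process returns the ¹D, ³P, and ¹S terms with nothing left over.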




Tables of this type can be constructed for any shell populated with 
any allowed number of electrons, but increase in complexity with the 
l value for the shell and up to a certain point with the number of 
electrons in the shell. Only one wave function and one term type come 
from a closed shell, however, and it is easy to see that the complexity of 
the table of wave functions must decrease as the shell approaches 
completion. Consider the case of a group of four equivalent p electrons, 
which differs from a closed p shell by the absence of two members of its 

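                               TABLE III

                                    M_S
  M_L  |     1      |              0               |     -1
 ------+------------+------------------------------+------------
   2   |            | (1⁺ 1⁻)                      |
   1   | (1⁺ 0⁺)    | (1⁺ 0⁻)(1⁻ 0⁺)               | (1⁻ 0⁻)
   0   | (1⁺ -1⁺)   | (1⁺ -1⁻)(0⁺ 0⁻)(1⁻ -1⁺)      | (1⁻ -1⁻)
  -1   | (0⁺ -1⁺)   | (0⁺ -1⁻)(0⁻ -1⁺)             | (0⁻ -1⁻)
  -2   |            | (-1⁺ -1⁻)                    |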



full complement. There are six possible pairs of values of $m_l$ and $m_s$ 
for p electrons. Each way of picking out n of the six pairs defines a 
product wave function for n equivalent p electrons. Every scheme for 
picking out four pairs is also a scheme for picking out two pairs for the 
two missing electrons. Since the values of $\sum_k m_{lk}$ and $\sum_k m_{sk}$ for the 
missing electrons are equal and opposite to the values of the same 
quantities for the electrons actually present, the table of $M_L$ and $M_S$ 
values for the possible wave functions of the missing electrons is the same 
as that for the four electrons present. Consequently the terms 
originating in a configuration of 4 or, more generally, n equivalent p electrons 
are the same as those originating in a configuration of 6 − 4, or 6 − n 
equivalent p electrons, as the case may be. This fact greatly simplifies 
the labor of working out the term types for partially filled shells of 
equivalent electrons.¹ 
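
The correspondence between electrons present and electrons missing can be confirmed by enumeration. The sketch below (hypothetical names, p shell only) checks both steps of the argument: the sums for the missing electrons are opposite to those for the electrons present, and the resulting tables of $M_L$, $M_S$ values coincide.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)
P_SHELL = [(ml, ms) for ml in (1, 0, -1) for ms in (HALF, -HALF)]  # six pairs

def totals(pairs):
    """(M_L, M_S) for a collection of (m_l, m_s) pairs."""
    return (sum(ml for ml, _ in pairs), sum(ms for _, ms in pairs))

def ml_ms_table(n):
    """Number of ways of choosing n of the six pairs, sorted by (M_L, M_S)."""
    return Counter(totals(chosen) for chosen in combinations(P_SHELL, n))

def missing_are_opposite(n):
    """True if, for every choice, the missing pairs have opposite M_L and M_S."""
    for chosen in combinations(P_SHELL, n):
        missing = [pair for pair in P_SHELL if pair not in chosen]
        M_L, M_S = totals(chosen)
        if totals(missing) != (-M_L, -M_S):
            return False
    return True

if __name__ == "__main__":
    print(missing_are_opposite(4))
    print(ml_ms_table(4) == ml_ms_table(2))
```

Because the table for any n is unchanged on reversing the signs of $M_L$ and $M_S$, the opposite sums lead to identical tables for n and 6 − n electrons.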

64. SELECTION RULES FOR ELECTRIC DIPOLE RADIATION 

64a. The Laporte Rule. — The simplest to derive of all the selection 
rules is the Laporte rule, which states that dipole-radiation transitions 

¹ The reader is referred to Condon and Shortley, T.A.S., p. 208, for a table of 
the term types originating in shells of equivalent p, d, f electrons in all stages of 
occupancy. 






Sec. 64] SELECTION RULES 541 

occur between even states and odd states, but never between even states and even, 
or between odd states and odd. 

Consider the matrix element 

$$D(A,B) = \int \psi_A^* \vec{D}\, \psi_B \, d\tau$$

in which $\psi_A$ and $\psi_B$ are two eigenfunctions of the Hamiltonian of a 
many-particle problem. If the integrand $\psi_A^* \vec{D} \psi_B$ is antisymmetric with respect 
to the triple-reflection operator K, it means that for every point of 
configuration space there is another point, derived from the first by reversing 
the signs of all the space coordinates, at which the function has the same 
value except for a reversal of sign. Hence the contributions of the 
different volume elements cancel in pairs and the sum-integral reduces 
to zero. But the electric moment $\vec{D}$ is reversed in sign by K. Hence 
the product $\psi_A^* \vec{D} \psi_B$ is antisymmetric if $\psi_A$ and $\psi_B$ have the same 
symmetry with respect to K. It follows that the matrix elements of $\vec{D}$ 
between states of the same symmetry with respect to K are all zero, and 
that no transitions between such states occur as a result of dipole 
radiation. This rule holds equally well, whether or not we take into account 
the perturbative terms $H_1$ and $H_2$ in the Hamiltonian. The rule 
regarding transitions due to magnetic dipole and electric quadrupole radiation 
is just the opposite, however, since these quantities are invariant with 
respect to K. 
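
The cancellation in pairs that underlies the Laporte rule can be seen in a one-dimensional analogue. The sketch below is only an illustration: the three trial functions are hypothetical (unnormalized oscillator-like functions of definite parity), the coordinate x stands in for the electric moment, and the integral is estimated by the midpoint rule on a symmetric interval.

```python
import math

def dipole_integral(psi_a, psi_b, half_width=8.0, n=4000):
    """Midpoint-rule estimate of the integral of psi_a(x) * x * psi_b(x)."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        total += psi_a(x) * x * psi_b(x) * dx
    return total

def even_0(x):
    return math.exp(-x * x / 2.0)                        # even under x -> -x

def odd_1(x):
    return x * math.exp(-x * x / 2.0)                    # odd under x -> -x

def even_2(x):
    return (2.0 * x * x - 1.0) * math.exp(-x * x / 2.0)  # even under x -> -x

if __name__ == "__main__":
    print("even-even:", dipole_integral(even_0, even_2))
    print("odd-odd:  ", dipole_integral(odd_1, odd_1))
    print("even-odd: ", dipole_integral(even_0, odd_1))
```

The even-even and odd-odd integrals vanish to rounding error because the contributions at x and −x cancel, while the even-odd integral does not.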

64b. Selection Rules for the Central-field Problem. — Selection rules
for the central-field problem A are of interest because they have approximate
validity for the unsimplified problem C in the large class of cases
in which the central-field approximation is a fairly good one. In Sec. 55
we derived the rules $\Delta m = 0, \pm 1$; $\Delta l = \pm 1$ for dipole radiation in the
two-particle central-field problem. We shall now prove that these
rules are directly applicable to many-electron atoms in the case A
approximation.

Let $\Phi_\alpha = \mathcal{G}\phi_\alpha$ and $\Phi_\beta = \mathcal{G}\phi_\beta$ denote two normalized central-field
wave functions of the kind defined by Eqs. (63·8) and (63·10). The
matrix element

$$D_x(\alpha,\beta) = (D_x\Phi_\beta,\ \Phi_\alpha) \tag{64·1}$$

expands into the double sum

$$D_x(\alpha,\beta) = \frac{1}{f!}\sum_{\lambda=1}^{f!}\sum_{\mu=1}^{f!}(-1)^{p_\lambda+p_\mu}\,(D_xP_\lambda\phi_\beta,\ P_\mu\phi_\alpha).$$

Each term of this sum is invariant of a permutation of the electron
coordinates in the integrand. Moreover, $D_x$ is unaffected by every permutation.
We can accordingly replace $(D_xP_\lambda\phi_\beta,\ P_\mu\phi_\alpha)$ by
$(D_xP_\mu^{-1}P_\lambda\phi_\beta,\ \phi_\alpha)$, where $P_\mu^{-1}$ is the inverse of $P_\mu$.
The products $P_\mu^{-1}P_\lambda$ are, of course, members of the permutation group.
The set of products obtained by holding either $\lambda$ or $\mu$ fast and allowing
the other to range through its allowed set of $f!$ values contains all members
of the group with no duplications. Hence every permutation $P_\tau$ is contained
exactly $f!$ times in the complete two-dimensional array of products
$P_\mu^{-1}P_\lambda$ involved in the above expression for $D_x(\alpha,\beta)$. Furthermore,
it is easy to see that if $P_\mu^{-1}P_\lambda = P_\tau$, it follows that
$(-1)^{p_\lambda+p_\mu} = (-1)^{p_\tau}$. Hence

$$D_x(\alpha,\beta) = \sum_{\tau=1}^{f!}(-1)^{p_\tau}(D_xP_\tau\phi_\beta,\ \phi_\alpha)
= e\sum_{k=1}^{f}\sum_{\tau=1}^{f!}(-1)^{p_\tau}(x_kP_\tau\phi_\beta,\ \phi_\alpha). \tag{64·2}$$
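The two group-theoretical facts used in passing from the double sum to (64·2) can be verified by brute force for a small number of electrons. The following sketch (names are illustrative, and permutations are represented as tuples of images, which is a choice made here rather than anything in the text) counts how often each product $P_\mu^{-1}P_\lambda$ occurs and checks the parity relation.

```python
from itertools import permutations
from collections import Counter

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """Parity (+1 or -1) obtained from the number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def product_table(f):
    """Count each product P_mu^{-1} P_lambda and test the parity relation."""
    group = list(permutations(range(f)))
    counts = Counter()
    signs_ok = True
    for lam in group:
        for mu in group:
            tau = compose(inverse(mu), lam)
            counts[tau] += 1
            signs_ok = signs_ok and (sign(lam) * sign(mu) == sign(tau))
    return counts, signs_ok

counts, signs_ok = product_table(3)
print(sorted(counts.values()), signs_ok)
```

For f = 3 every one of the 3! = 6 permutations appears exactly 6 times in the 6 × 6 array, which is why the factor 1/f! drops out of (64·2).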

Due to the orthogonality of every pair of functions $(k|a_i)$, $(k|a_j)$ for
which the two sets of quantum numbers $a_i$, $a_j$ are not equal, the scalar
product $(x_kP_\tau\phi_\beta,\ \phi_\alpha)$ will vanish if any factor of $\phi_\alpha$, except the one
containing the coordinates of the $k$th electron, differs from the corresponding
factor of $P_\tau\phi_\beta$ (we take corresponding factors to be those involving
the coordinates of the same electron). This means that, if $D_x(\alpha,\beta)$ is not
to vanish, each of the individual sets of four quantum numbers $n, l, m_l, m_s$
used in defining $\phi_\alpha$, with one exception, must be paired with an identical
set used in defining $\phi_\beta$. In other words it must be possible to transform
$\phi_\beta$ by means of a properly chosen permutation into a form which is
identical with that of $\phi_\alpha$ except for a single factor. A transition between
states $\Phi_\alpha$ and $\Phi_\beta$, which differ in this way only, is called a single-electron
jump. Since the result holds also for the elements of $D_y$ and $D_z$, we
conclude that in the central-field approximation A the only allowed transitions
are those of the single-electron type.

We now assume that the above condition is satisfied. If $(x_kP_\tau\phi_\beta,\ \phi_\alpha)$
is not to vanish, $k$ must be the ordinal number of that factor in $\phi_\alpha$ which
does not match the corresponding factor of $P_\tau\phi_\beta$. Also the permutation
$P_\tau$ is uniquely determined. Hence the sum on the right of
(64·2) reduces to a single term. By proper choice of the order of the
individual sets of quantum numbers $b_1, b_2, \cdots, b_f$ which define $\phi_\beta$,
we can insure that $P_\tau$ shall reduce to the identical permutation. Let
us assume that the required choice has been made. Equation (64·2)
becomes

$$D_x(\alpha,\beta) = e\,(x_k(k|b_k),\ (k|a_k)).$$

Thus the matrix element of $D_x$ (or $D_y$, or $D_z$) for an allowed problem-A
transition is identical with the corresponding element of the matrix for
a single electron. It follows that in the central-field approximation
the selection rules and intensities for the many-electron atom are the same
as for the single-electron problem with the same potential-energy function,
viz.,

$$\Delta l_k = \pm 1, \qquad \Delta m_{lk} = 0, \pm 1.$$

This conclusion substantiates the assumption of Sec. 57h, p. 488.
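The single-electron-jump criterion lends itself to a small bookkeeping sketch. In the code below a central-field state is described, by assumption, as a list of individual sets (n, l, m_l, 2m_s); the function names are hypothetical. The requirement that the spin quantum number of the jumping electron be unchanged is added here because the electric moment contains no spin coordinates; it is not listed explicitly in the rules above.

```python
from collections import Counter

def single_electron_jump(config_a, config_b):
    """Return (set in a, set in b) if the two lists of individual quantum-number
    sets differ in exactly one member; otherwise return None."""
    a, b = Counter(config_a), Counter(config_b)
    only_a = list((a - b).elements())
    only_b = list((b - a).elements())
    if len(only_a) == 1 and len(only_b) == 1:
        return only_a[0], only_b[0]
    return None

def dipole_allowed(config_a, config_b):
    """Central-field (problem A) dipole rule for a pair of configurations."""
    jump = single_electron_jump(config_a, config_b)
    if jump is None:
        return False
    (_, l1, ml1, ms1), (_, l2, ml2, ms2) = jump
    # Delta l = +-1, Delta m_l = 0, +-1; spin unchanged (assumption stated above).
    return abs(l1 - l2) == 1 and abs(ml1 - ml2) <= 1 and ms1 == ms2

# Individual sets are (n, l, m_l, 2*m_s).
ground = [(1, 0, 0, +1), (1, 0, 0, -1)]   # 1s 1s
one_s_two_p = [(1, 0, 0, +1), (2, 1, 0, -1)]   # 1s 2p
one_s_two_s = [(1, 0, 0, +1), (2, 0, 0, -1)]   # 1s 2s
two_p_two_p = [(2, 1, 0, +1), (2, 1, 1, -1)]   # 2p 2p

print(dipole_allowed(one_s_two_p, ground))   # single jump with Delta l = 1
print(dipole_allowed(one_s_two_s, ground))   # single jump with Delta l = 0
print(dipole_allowed(two_p_two_p, ground))   # two electrons would have to jump
```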

64c. Selection Rules for Problems B and C. — Selection rules for
problem B are of interest for the same reason as those for the central-field
problem, i.e., because of their approximate validity for the unmodified
problem C. They are of course more widely applicable to actual
atoms than the central-field rules because the Hamiltonian of B is better
than that of A. In order to work out the rules for problem B we have to
determine which matrix elements of the electric moment vanish when
that matrix is worked out from a scheme of wave functions which makes
$\mathcal{L}^2$ and $\mathcal{S}^2$ diagonal. It is convenient to assume that $\mathcal{L}_z$, $\mathcal{S}_z$ are also
diagonal, giving what is known as an $LM_LSM_S$ scheme of wave functions.

The first selection rule to be considered is that which forbids the
occurrence of intersystem combination lines, i.e., transitions involving
changes in the quantum number $S$. For this purpose it suffices to notice
that the product of any eigenfunction of $\mathcal{S}^2$ with a function of the space
coordinates only, such as $D_x$, $D_y$, $D_z$, is also an eigenfunction of $\mathcal{S}^2$ with
the same eigenvalue as the original function. It follows that the matrix
element $\vec{D}(A,B)$ vanishes if $\psi_A$ and $\psi_B$ are eigenfunctions of $\mathcal{S}^2$ with
different values of $S$. This proves the selection rule, which holds very well
experimentally for the lighter atoms.

The selection rules for $M_L$ and $L$ are

$$\Delta M_L = 0, \pm 1, \tag{64·3}$$

$$\Delta L = 0, \pm 1. \tag{64·4}$$

The latter rule was used in Sec. 58a, p. 493. Closely related to the
above, and derivable by the same procedure, are the rules

$$\Delta M = 0, \pm 1, \tag{64·5}$$

$$\Delta J = 0, \pm 1, \tag{64·6}$$

for problem C.

These rules are most readily deduced by algebraic methods like
those used in Sec. 61b for the determination of the angular-momentum
matrices.¹ Let $\vec{\mathcal{L}}_k$ and $\vec{D}_k$ denote respectively the vector operators
for the orbital angular momentum and the electric moment of the $k$th
electron. As a starting point we work out the Poisson brackets for the

1 The first quantum-mechanical proof of these selection principles was the matrix
derivation by M. Born, W. Heisenberg, and P. Jordan, Zeits. f. Physik 35, 557 (1925).




three components of $\vec{\mathcal{L}}_k$ paired in turn with the three components of $\vec{D}_k$.
These can be obtained by direct partial differentiation, or by application
of the commutation rules (37·10) for the Cartesian coordinates and
conjugate components of linear momentum.

$$\begin{aligned}
[\mathcal{L}_{kx},D_{kx}] &= 0, & [\mathcal{L}_{ky},D_{kx}] &= -D_{kz}, & [\mathcal{L}_{kz},D_{kx}] &= D_{ky},\\
[\mathcal{L}_{kx},D_{ky}] &= D_{kz}, & [\mathcal{L}_{ky},D_{ky}] &= 0, & [\mathcal{L}_{kz},D_{ky}] &= -D_{kx},\\
[\mathcal{L}_{kx},D_{kz}] &= -D_{ky}, & [\mathcal{L}_{ky},D_{kz}] &= D_{kx}, & [\mathcal{L}_{kz},D_{kz}] &= 0.
\end{aligned} \tag{64·7}$$

Since all the coordinates and momenta of any one electron commute
with all the coordinates and momenta of any other, the equations are
equally valid if we drop the subscript $k$ throughout.

Let $D$ and $D^\dagger$ denote the adjoint complex components of the electric
moment defined by

$$D = D_x + iD_y = \sum_{k=1}^{f} e_k(x_k + iy_k), \qquad
D^\dagger = D_x - iD_y = \sum_{k=1}^{f} e_k(x_k - iy_k), \tag{64·8}$$

as in Sec. 55b. From Eqs. (64·7) and (64·8) it follows that

$$[\mathcal{L}_z, D] = -iD, \qquad [\mathcal{L}_z, D^\dagger] = iD^\dagger. \tag{64·9}$$

Each of the above operator equations implies a corresponding matrix
equation. Thus

$$[\mathbf{L}_z, \mathbf{D}] = -i\mathbf{D}, \qquad [\mathbf{L}_z, \mathbf{D}^\dagger] = i\mathbf{D}^\dagger. \tag{64·10}$$

We assume a matrix scheme based on simultaneous eigenfunctions
of $\mathcal{L}_z$ and such additional independent commuting observables $\mu$ as are
needed to make a complete set. As in Sec. 61b, p. 514, we number the
various sets of mutually compatible eigenvalues of the $\mu$'s and call the
general ordinal number $\tau$. The typical matrix elements of $D$ and $\mathcal{L}_z$ take
the respective forms $D(\tau',M_L';\ \tau'',M_L'')$ and
$M_L'\,(h/2\pi)\,\delta_{\tau'\tau''}\delta_{M_L'M_L''}$. Equations (64·10) yield

$$D(\tau',M_L';\ \tau'',M_L'')\,(M_L'' - M_L' + 1) = 0, \tag{64·11}$$

$$D^\dagger(\tau',M_L';\ \tau'',M_L'')\,(M_L'' - M_L' - 1) = 0. \tag{64·12}$$

Similarly, the relation $[\mathcal{L}_z, D_z] = 0$ yields

$$D_z(\tau',M_L';\ \tau'',M_L'')\,(M_L'' - M_L') = 0. \tag{64·13}$$

From these three equations it follows that all the elements of all three
matrices $\mathbf{D}_x$, $\mathbf{D}_y$, $\mathbf{D}_z$ vanish in the above scheme, with the exception of
those for which

$$\Delta M_L = M_L'' - M_L' = 0, \pm 1.$$





As emphasized at the beginning of Sec. 54f, the basic formulas (54·8),
(54·9), and (54·10) for the transition probabilities due to dipole radiation
assume that the vector electric-moment matrix $\vec{\mathbf{D}}$ is to be computed
from a complete orthonormal system of eigenfunctions of the atomic
Hamiltonian. If the atomic Hamiltonian commutes with $\mathcal{L}_z$ there
is a complete orthonormal system of simultaneous eigenfunctions of $H$
and $\mathcal{L}_z$. When the transition probabilities are computed from the matrix
of $\vec{D}$ in this scheme, there are no contributions from pairs of states which
do not conform to the selection rule (64·3).

Since all the spin operators commute with all components of $\vec{\mathcal{L}}$, Eqs.
(64·7), (64·9), and (64·10) are equally valid if we substitute for $\vec{\mathcal{L}}$ the total
angular momentum $\vec{\mathcal{L}} + \vec{\mathcal{S}}$, or $\vec{\mathcal{J}}$. Hence the matrices $\mathbf{D}_x$, $\mathbf{D}_y$, $\mathbf{D}_z$ in any
scheme which makes $\mathcal{J}_z$ diagonal will have no nonvanishing elements
which do not obey the selection rule (64·5).

There are two types of application of the selection rules for $M_L$ and $M$,
viz., applications to free atomic systems with spherical symmetry, and
applications to atomic systems with axial symmetry only. In the former
case the selection rule for $M_L$ in problem B and the selection rule for $M$ in
problem C (or B) are of use in working out the transition probabilities
and intensities of different spectrum lines, but are not directly reflected
in the elimination of corresponding lines because $\mathcal{L}_z$, or $\mathcal{J}_z$, as the case
may be, is not a function of the energy, and pairs of states with different
values of $\Delta M_L$, or $\Delta M$, contribute to the same spectrum line.

Consider, next, applications to atomic systems with axial symmetry
only. The fixed-nuclei problem for a diatomic molecule (cf. Sec. 52a)
and the Zeeman and Stark effects afford examples of this type of system
in which the $z$ component of angular momentum, say $\mathcal{L}_z$, becomes an
integral of the motion when the $z$ axis is identified with the axis of
symmetry, although the perpendicular components are not integrals.
In such a case every integral of the motion commutes with $\mathcal{L}_z^2$ ($R_x$ and
$R_y$, which do not commute with $\mathcal{L}_z$, may be integrals) and $\mathcal{L}_z^2$ becomes a
function of the energy. We can accordingly label the energy levels with
corresponding values of $M_L^2$ or $|M_L|$. The selection rule (64·3) is then
directly reflected in the spectrum through the elimination of lines
corresponding to changes in $|M_L|$ which have absolute values greater than unity.

In order to derive the selection rules (64·4) and (64·6) we shall employ
a procedure first used by Dirac¹ in proving the selection rule for the
two-particle problem of Sec. 55b and later slightly modified by Güttinger
and Pauli² to cover the general case of any atomic system with spherical

1 P. A. M. Dirac, Proc. Roy. Soc. A111, 281 (1926); P.Q.M., §47.

2 P. Güttinger and W. Pauli, Zeits. f. Physik 67, 743 (1931); Condon and
Shortley, T.A.S., p. 60.





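symmetry. From Eqs. (64·7) it follows that

$$\begin{aligned}
[\mathcal{L}^2, D_z] &= [\mathcal{L}_x^2, D_z] + [\mathcal{L}_y^2, D_z]
 = \mathcal{L}_yD_x + D_x\mathcal{L}_y - (\mathcal{L}_xD_y + D_y\mathcal{L}_x)\\
&= 2(\mathcal{L}_yD_x - \mathcal{L}_xD_y) - \frac{h}{\pi i}D_z\\
&= 2(\mathcal{L}_yD_x - D_y\mathcal{L}_x) = 2(D_x\mathcal{L}_y - \mathcal{L}_xD_y).
\end{aligned}$$

By cyclic advancement of the subscripts:

$$[\mathcal{L}^2, D_x] = 2(\mathcal{L}_zD_y - D_z\mathcal{L}_y) = 2(D_y\mathcal{L}_z - \mathcal{L}_yD_z).$$

Hence

$$\begin{aligned}
[\mathcal{L}^2,[\mathcal{L}^2, D_z]] &= 2\Big[\mathcal{L}^2,\ \mathcal{L}_yD_x - \mathcal{L}_xD_y - \frac{h}{2\pi i}D_z\Big]\\
&= 2\mathcal{L}_y[\mathcal{L}^2, D_x] - 2\mathcal{L}_x[\mathcal{L}^2, D_y] - 2\frac{h}{2\pi i}[\mathcal{L}^2, D_z]\\
&= 4\mathcal{L}_y(D_y\mathcal{L}_z - \mathcal{L}_yD_z) - 4\mathcal{L}_x(\mathcal{L}_xD_z - D_x\mathcal{L}_z) + 2(\mathcal{L}^2D_z - D_z\mathcal{L}^2)\\
&= 4(\vec{\mathcal{L}}\cdot\vec{D})\mathcal{L}_z - 2(\mathcal{L}^2D_z + D_z\mathcal{L}^2).
\end{aligned}$$

Thus

$$[\mathcal{L}^2,[\mathcal{L}^2, \vec{D}]] = 4(\vec{\mathcal{L}}\cdot\vec{D})\vec{\mathcal{L}} - 2(\mathcal{L}^2\vec{D} + \vec{D}\mathcal{L}^2). \tag{64·14}$$

By direct expansion of the left-hand member,

$$\mathcal{L}^4\vec{D} - 2\mathcal{L}^2\vec{D}\mathcal{L}^2 + \vec{D}\mathcal{L}^4
= -4\Big(\frac{h}{2\pi}\Big)^2(\vec{\mathcal{L}}\cdot\vec{D})\vec{\mathcal{L}}
+ 2\Big(\frac{h}{2\pi}\Big)^2(\mathcal{L}^2\vec{D} + \vec{D}\mathcal{L}^2). \tag{64·15}$$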


Güttinger and Pauli observe that $\vec{\mathcal{L}}\cdot\vec{D}$ commutes with each component
of $\vec{\mathcal{L}}$ and therefore with $\mathcal{L}^2$. It follows that $(\vec{\mathcal{L}}\cdot\vec{D})\vec{\mathcal{L}}$ commutes with $\mathcal{L}^2$.
Hence the matrix of this quantity in a scheme which makes $\mathcal{L}^2$ diagonal
will have no nonvanishing elements which are off-diagonal with respect
to $L$. Let us form the matrix element $(\tau',L';\ \tau'',L'')$ of (64·15) in such a
scheme. We again employ $\tau$ to designate the ordinal number of a
set of eigenvalues of additional independent commuting observables
needed for a complete set. Let us agree that $L' \neq L''$ in order to eliminate
the term of the equation which contains $(\vec{\mathcal{L}}\cdot\vec{D})\vec{\mathcal{L}}$. Then

$$\Big(\frac{h}{2\pi}\Big)^4\big[L'^2(L'+1)^2 - 2L'(L'+1)L''(L''+1) + L''^2(L''+1)^2\big]\,\vec{D}(\tau',L';\ \tau'',L'')
= 2\Big(\frac{h}{2\pi}\Big)^4\big[L'(L'+1) + L''(L''+1)\big]\,\vec{D}(\tau',L';\ \tau'',L'').$$

This equation is reducible¹ to the form

$$\big[(L' + L'' + 1)^2 - 1\big]\big[(L' - L'')^2 - 1\big]\,\vec{D}(\tau',L';\ \tau'',L'') = 0. \tag{64·16}$$

1 Cf. references in footnote 2, p. 545.



Sec. 65] THE HELIUM ATOM 547

Since $L' \neq L''$ and neither $L'$ nor $L''$ can be negative, the first factor
cannot vanish. Consequently $\vec{D}(\tau',L';\ \tau'',L'')$ must vanish unless
$L' - L'' = \pm 1$. We conclude that in general $\vec{D}(\tau',L';\ \tau'',L'')$ is zero
unless $L' - L'' = 0, \pm 1$. It follows that the selection rule (64·4)
must hold for any atomic system if $\mathcal{L}^2$ is an integral of the motion.
Similarly (64·6) must hold if $\mathcal{J}^2$ is an integral of the motion.
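The reduction to the factored form (64·16) is a matter of elementary algebra, which can be confirmed by direct evaluation for integer quantum numbers. The sketch below uses illustrative function names; the common factor $(h/2\pi)^4$ has been divided out.

```python
def expanded_form(lp, lpp):
    """[a^2 - 2ab + b^2] - 2[a + b] with a = L'(L'+1), b = L''(L''+1)."""
    a = lp * (lp + 1)
    b = lpp * (lpp + 1)
    return a * a - 2 * a * b + b * b - 2 * (a + b)

def factored_form(lp, lpp):
    """[(L' + L'' + 1)^2 - 1][(L' - L'')^2 - 1], the bracket of (64.16)."""
    return ((lp + lpp + 1) ** 2 - 1) * ((lp - lpp) ** 2 - 1)

def allowed_differences(l_max=6):
    """Values of L' - L'' (with L' != L'') for which the bracket vanishes."""
    return sorted({lp - lpp
                   for lp in range(l_max + 1)
                   for lpp in range(l_max + 1)
                   if lp != lpp and factored_form(lp, lpp) == 0})

print(all(expanded_form(a, b) == factored_form(a, b)
          for a in range(8) for b in range(8)))
print(allowed_differences())
```

The second line of output lists the only off-diagonal changes in L for which the matrix element is not forced to vanish.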

In the case of the two-particle problem without electron spin the
scalar product $\vec{\mathcal{L}}\cdot\vec{D}$ vanishes automatically and Eq. (64·16) holds even
when $L'$ and $L''$ are equal. Transitions for which $\Delta L = 0$ are accordingly
ruled out when $L'$ and $L''$ do not both vanish. A general rule to be
derived in the next paragraph forbids transitions of the latter type.
Consequently the selection rule for the angular quantum number in this
case becomes

$$\Delta L = \Delta l = \pm 1,$$

as proved in Sec. 55b by another method.

In concluding this section we derive rules forbidding transitions
between two states for each of which $\mathcal{L}^2$ (or $\mathcal{J}^2$) has the eigenvalue zero.
These rules originate in the fact that $D_x$, $D_y$, $D_z$ are eigenfunctions of $\mathcal{L}^2$
and $\mathcal{J}^2$ with the eigenvalue $2(h/2\pi)^2$, i.e., with the quantum number $L$
(or $J$) set equal to unity. Thus

$$\begin{aligned}
(\mathcal{L}_x^2 + \mathcal{L}_y^2 + \mathcal{L}_z^2)\sum_{k=1}^{f} e_kx_k
&= \sum_{k=1}^{f} e_k(\mathcal{L}_{ky}^2 + \mathcal{L}_{kz}^2)x_k\\
&= \frac{h}{2\pi i}\sum_{k=1}^{f} e_k(\mathcal{L}_{ky}z_k - \mathcal{L}_{kz}y_k)
= (1 + 1)\Big(\frac{h}{2\pi}\Big)^2\sum_{k=1}^{f} e_kx_k.
\end{aligned}$$

The product of the wave functions of two states for each of which $\mathcal{L}^2$ has
the eigenvalue zero is also an eigenfunction of $\mathcal{L}^2$ with the eigenvalue
zero. This product is consequently orthogonal to each component of $\vec{D}$.
This proves the rule for dipole radiation. The $\mathcal{J}^2$ rule follows in the
same way. In fact it is possible by an argument of the above type to
prove that these rules hold for electric quadrupole radiation as well.
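The rules of this section for problem B can be collected into a single test. The following is a minimal sketch under stated assumptions: a state is represented by a tuple (L, S, M_L, parity) with parity +1 for even and -1 for odd states, and the function name is hypothetical. It encodes only the rules derived above and says nothing about intensities.

```python
def ls_dipole_allowed(initial, final):
    """Apply the Laporte rule, the intersystem rule, (64.3), (64.4) and the
    L = 0 -> L = 0 prohibition to two (L, S, M_L, parity) tuples."""
    l_a, s_a, m_a, p_a = initial
    l_b, s_b, m_b, p_b = final
    if p_a == p_b:                 # Laporte rule: parity must change
        return False
    if s_a != s_b:                 # no intersystem combinations
        return False
    if abs(m_a - m_b) > 1:         # Delta M_L = 0, +-1
        return False
    if abs(l_a - l_b) > 1:         # Delta L = 0, +-1
        return False
    if l_a == 0 and l_b == 0:      # L = 0 to L = 0 forbidden
        return False
    return True

print(ls_dipole_allowed((1, 0, 0, -1), (0, 0, 0, +1)))   # singlet P -> singlet S
print(ls_dipole_allowed((1, 1, 0, -1), (0, 0, 0, +1)))   # triplet P -> singlet S
print(ls_dipole_allowed((0, 0, 0, -1), (0, 0, 0, +1)))   # S -> S
```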

65. THE HELIUM ATOM AND “EXCHANGE ENERGY”

65a. Two-electron Atoms. — The work of the previous section completes
the theoretical derivation of the empirical spectroscopic rules
formulated in Secs. 56, 57, and 58a. We shall now turn our attention
to the simplest example of a many-electron atomic system with Russell-
Saunders coupling. This is the case of a nucleus of charge $Ze$ with two
“planetary” electrons, i.e., the problem of the neutral helium atom and




of the helium-like ions of larger atomic number, such as Li⁺, etc.
It was an investigation of this problem by Heisenberg¹ which first revealed
the electrostatic nature of the energy differences between terms of
different multiplicity with the same angular momentum and coming
from the same configuration. The investigation also led to the discovery
of the restriction of physically admissible wave functions by the exclusion
of those which are not antisymmetric with respect to electron interchanges
(wave-mechanical form of the Pauli principle).

For our present purpose it is convenient to use symbols for the
individual-electron wave functions in the central-field approximation
which indicate their form more explicitly than those adopted in Sec. 63b.
Explicit indications of the quantum numbers $m_l$ and $m_s$ are needed.
The quantum numbers $n$, $l$ can be replaced by a single index integer
$\tau$. The notation of Sec. 61a will be used for the spin functions of the
individual electrons. There are eight orthogonal central-field product
wave functions for a given pair of sets of values for the orbital quantum
numbers $n$, $l$, $m_l$. Four of these are derivable from the other four by
application of the interchange permutation $P_{12}$. The four primary
product functions can be described by the symbols

$$\begin{aligned}
\phi_1 &= (1|\tau,m_l)\,\alpha(1)\,(2|\tau',m_l')\,\alpha(2),\\
\phi_2 &= (1|\tau,m_l)\,\beta(1)\,(2|\tau',m_l')\,\beta(2),\\
\phi_3 &= (1|\tau,m_l)\,\alpha(1)\,(2|\tau',m_l')\,\beta(2),\\
\phi_4 &= (1|\tau,m_l)\,\beta(1)\,(2|\tau',m_l')\,\alpha(2).
\end{aligned}$$

$\mathcal{G}$ transforms $\phi_1$, $\phi_2$, $\phi_3$, $\phi_4$ into the following four independent
antisymmetric functions

$$\begin{aligned}
\mathcal{G}\phi_1 &= \tfrac{1}{\sqrt{2}}\big[(1|\tau,m_l)\alpha(1)(2|\tau',m_l')\alpha(2) - (2|\tau,m_l)\alpha(2)(1|\tau',m_l')\alpha(1)\big],\\
\mathcal{G}\phi_2 &= \tfrac{1}{\sqrt{2}}\big[(1|\tau,m_l)\beta(1)(2|\tau',m_l')\beta(2) - (2|\tau,m_l)\beta(2)(1|\tau',m_l')\beta(1)\big],\\
\mathcal{G}\phi_3 &= \tfrac{1}{\sqrt{2}}\big[(1|\tau,m_l)\alpha(1)(2|\tau',m_l')\beta(2) - (2|\tau,m_l)\alpha(2)(1|\tau',m_l')\beta(1)\big],\\
\mathcal{G}\phi_4 &= \tfrac{1}{\sqrt{2}}\big[(1|\tau,m_l)\beta(1)(2|\tau',m_l')\alpha(2) - (2|\tau,m_l)\beta(2)(1|\tau',m_l')\alpha(1)\big].
\end{aligned}$$

Two of these functions can be resolved at once into products of orbital
and spin functions. Thus

$$\begin{aligned}
\Phi_1 = \mathcal{G}\phi_1 &= \tfrac{1}{\sqrt{2}}\big[(1|\tau,m_l)(2|\tau',m_l') - (2|\tau,m_l)(1|\tau',m_l')\big]\alpha(1)\alpha(2),\\
\Phi_2 = \mathcal{G}\phi_2 &= \tfrac{1}{\sqrt{2}}\big[(1|\tau,m_l)(2|\tau',m_l') - (2|\tau,m_l)(1|\tau',m_l')\big]\beta(1)\beta(2).
\end{aligned} \tag{65·2}$$

1 W. Heisenberg, Zeits. f. Physik 38, 411 (1926), 39, 499 (1926), 41, 239 (1927).





By addition, subtraction, and renormalization one obtains two other
antisymmetric functions which factor in the same way, viz.,

$$\begin{aligned}
\Phi_3 = \tfrac{1}{\sqrt{2}}(\mathcal{G}\phi_3 + \mathcal{G}\phi_4) &= \tfrac{1}{2}\big[(1|\tau,m_l)(2|\tau',m_l') - (2|\tau,m_l)(1|\tau',m_l')\big]
\times\big[\alpha(1)\beta(2) + \beta(1)\alpha(2)\big],\\
\Phi_4 = \tfrac{1}{\sqrt{2}}(\mathcal{G}\phi_3 - \mathcal{G}\phi_4) &= \tfrac{1}{2}\big[(1|\tau,m_l)(2|\tau',m_l') + (2|\tau,m_l)(1|\tau',m_l')\big]
\times\big[\alpha(1)\beta(2) - \beta(1)\alpha(2)\big].
\end{aligned} \tag{65·3}$$

$\Phi_4$ is symmetric with respect to a permutation of the space coordinates but
antisymmetric in the spin coordinates, while the other three functions are
antisymmetric in the space coordinates and symmetric in the spin
coordinates. $\Phi_4$ is readily seen to be an eigenfunction of $\mathcal{S}^2$ with the
eigenvalue 0, while $\Phi_1$, $\Phi_2$, $\Phi_3$ are eigenfunctions of $\mathcal{S}^2$ with the common
eigenvalue $2(h/2\pi)^2$. Thus $\Phi_4$ belongs to the singlet system ($S = 0$),
while the other three belong to the triplet system ($S = 1$). All four
of the $\Phi$'s are eigenfunctions of $\mathcal{S}_z$ and $\mathcal{L}_z$ as well, with quantum numbers
obtainable by inspection and indicated in Table IV.¹ Thus in this
case the process of antisymmetrizing and factoring gives a set of wave
functions which make $\mathcal{S}^2$, $\mathcal{S}_z$, and $\mathcal{L}_z$ diagonal.



                          TABLE IV

             Φ1            Φ2            Φ3            Φ4

  S           1             1             1             0

  M_S         1            -1             0             0

  M_L     m_l + m_l'    m_l + m_l'    m_l + m_l'    m_l + m_l'
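The spin quantum numbers in Table IV can be checked by letting the total spin operators act on the four spin factors. The sketch below works in units of $h/2\pi$, represents a two-electron spin function as a dictionary from (2m_s of electron 1, 2m_s of electron 2) to a real amplitude, and uses the identity $\mathcal{S}^2 = \mathcal{S}_-\mathcal{S}_+ + \mathcal{S}_z^2 + \mathcal{S}_z$. The representation and all names are choices made for this illustration.

```python
import math

def add(*states):
    """Sum of several spin functions given as {(s1, s2): amplitude}."""
    total = {}
    for st in states:
        for key, amp in st.items():
            total[key] = total.get(key, 0.0) + amp
    return total

def s_z(state):
    """Total S_z in units of h/2pi; s1 and s2 are +1 or -1 (twice m_s)."""
    return {k: 0.5 * (k[0] + k[1]) * a for k, a in state.items()}

def ladder(state, step):
    """Total raising (step=+1) or lowering (step=-1) operator S_1 + S_2."""
    out = {}
    for (s1, s2), a in state.items():
        if s1 == -step:
            out[(step, s2)] = out.get((step, s2), 0.0) + a
        if s2 == -step:
            out[(s1, step)] = out.get((s1, step), 0.0) + a
    return out

def s_squared(state):
    """S^2 = S_- S_+ + S_z^2 + S_z, in units of (h/2pi)^2."""
    return add(ladder(ladder(state, +1), -1), s_z(s_z(state)), s_z(state))

def eigenvalue(op, state, tol=1e-12):
    """Return lam if op(state) = lam * state (real amplitudes), else None."""
    image = op(state)
    norm = sum(a * a for a in state.values())
    lam = sum(image.get(k, 0.0) * a for k, a in state.items()) / norm
    keys = set(state) | set(image)
    residual = max(abs(image.get(k, 0.0) - lam * state.get(k, 0.0)) for k in keys)
    return lam if residual < tol else None

r = 1 / math.sqrt(2)
spin_factors = {
    "Phi1": {(+1, +1): 1.0},                 # alpha(1) alpha(2)
    "Phi2": {(-1, -1): 1.0},                 # beta(1) beta(2)
    "Phi3": {(+1, -1): r, (-1, +1): r},      # symmetric combination
    "Phi4": {(+1, -1): r, (-1, +1): -r},     # antisymmetric combination
}

for name, chi in spin_factors.items():
    print(name, eigenvalue(s_squared, chi), eigenvalue(s_z, chi))
```

An eigenvalue S(S + 1) = 2 corresponds to S = 1 and the eigenvalue 0 to S = 0, in agreement with the first two rows of the table.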


If possible, we should like $\mathcal{L}^2$ to be diagonal also, since the order of
the secular determinants to be solved would then be reduced to a minimum.
In fact it is easy to see that the order of the secular determinants
would thereby be reduced to unity. We know from Sec. 63d that the
coupling of two electrons produces only one term of a given type originating
in a given initial configuration. Hence all simultaneous eigenfunctions
of $\mathcal{L}^2$, $\mathcal{S}^2$, $\mathcal{L}_z$, $\mathcal{S}_z$ coming from a common configuration and having
the same set of values of $L$, $S$, $M_L$, $M_S$ are linearly dependent. It
follows that by dealing with one such set at a time we reduce the order
of the secular determinant to unity, as was to be proved. In other
1 From the equality of the space factors of $\Phi_1$, $\Phi_2$, $\Phi_3$ we can at once verify the
theorem that the energy is independent of $M_S$. The final zero-order wave functions
will be made up of linear combinations of $\Phi$'s with the same $S$, $M_S$, $M_L$. The space
factors of the $\Phi$'s with given $S$ and $M_L$ are independent of $M_S$. Hence the secular
equation ($U_s$ neglected) and the energies, which are its roots, are independent of $M_S$.






words the first-order approximation to the energy of a problem-B level
with given $L$, $S$ values coming from a given configuration is simply the
mean value of the Hamiltonian $H_B = H_0 + H_1$ for the unique corresponding
case A wave function.

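Although not all the $\Phi$'s are eigenfunctions of $\mathcal{L}^2$, there is a very
important class of $\Phi$'s which are, viz., those which come from configurations
in which one electron is in an s state with azimuthal quantum
number zero. Suppose, for example, that

$$\begin{aligned}
(1|\tau,m_l) &= (1|n,l,m_l) = (1|n,0,0) \equiv v(1),\\
(2|\tau',m_l') &= (2|n',l',m_l') \equiv w(2).
\end{aligned} \tag{65·4}$$

Then

$$\mathcal{L}^2\Phi_\lambda = \big[(\mathcal{L}_{1x} + \mathcal{L}_{2x})^2 + (\mathcal{L}_{1y} + \mathcal{L}_{2y})^2 + (\mathcal{L}_{1z} + \mathcal{L}_{2z})^2\big]\tfrac{1}{\sqrt{2}}\big[v(1)w(2) \pm w(1)v(2)\big]
\times \text{spin factor} = l'(l'+1)\Big(\frac{h}{2\pi}\Big)^2\Phi_\lambda, \qquad \lambda = 1, 2, 3, 4.$$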

^ is also seen to be an eigenfunction of with the eigenvalue V. The 
normal state of helium and the helium-like ions, together with all states
in which only one electron is excited, belongs to the class under consideration.
Since all the more familiar optical energy levels are of this
type, we can exclude other cases from consideration without great
practical loss of generality.

Let $E^{(0)}$ and $E_\lambda^{(1)}$ denote respectively the central-field energy of a
configuration and the first-order energy correction for an associated $\Phi_\lambda$.
It follows from the above that their sum, say $E_\lambda^{(0+1)}$, is given by

$$E_\lambda^{(0+1)} = (H_B\Phi_\lambda,\ \Phi_\lambda) = ([H_0 + H_1]\Phi_\lambda,\ \Phi_\lambda). \tag{65·5}$$

Summing over the different pairs of values of the spin variables we reduce
the scalar product to the form

$$E_\lambda^{(0+1)} = \tfrac{1}{2}\int\cdots\int\big[v(1)w(2) \pm w(1)v(2)\big]^{*}H_B\big[v(1)w(2) \pm w(1)v(2)\big]\,d\tau_1d\tau_2, \tag{65·6}$$

where the negative signs go with the values 1, 2, 3 for $\lambda$ and the positive
signs go with the value 4. Subtracting off the central-field energy, we
have

$$E_\lambda^{(1)} = I_1 \pm I_2, \tag{65·7}$$

$$I_1 = \int\cdots\int H_1\,|v(1)|^2|w(2)|^2\,d\tau_1d\tau_2, \tag{65·8}$$

$$I_2 = \int\cdots\int w(1)^{*}v(2)^{*}\,H_1\,v(1)w(2)\,d\tau_1d\tau_2. \tag{65·9}$$

In first approximation $I_1$ gives the difference between the central-field
energy and the mean of the singlet and triplet energies. $I_2$ determines
the spacing of the corresponding singlet and triplet levels. It is the
analogue for the helium problem of the integral met with in the study
of the H₂ molecule problem in Sec. 52b and is accordingly referred to as
an exchange integral.

In the special case under consideration (62·10) degenerates into

$$H_1 = \frac{e^2}{r_{12}} - \Big[\frac{Ze^2}{r_1} + \frac{Ze^2}{r_2} + V(r_1) + V(r_2)\Big].$$

We have to decide what form to give $V(r)$ before we can proceed farther.
Ordinarily in an atomic computation it is necessary to apply the Hartree
self-consistent field method in order to determine a satisfactory central-field
potential function $V$. If we exclude states for which $l'$ is zero,
however, we can adopt a simple suggestion due to Heisenberg which
avoids this complication. The suggestion is, in effect, to make two
choices of $V(r)$, viz., $V_1(r) = -Ze^2/r$, and $V_2(r) = -(Z-1)e^2/r$, using
one for the inner electron and the other for the outer electron. The
basis for this procedure is given by the small values of the quantum
defect for the principal-series energy levels (0.06 for the triplet levels
and −0.01 for the singlet levels) indicating a small interpenetration of
the wave functions of the inner electron and the outer electron when the
latter has an azimuthal quantum number not less than unity. Neglecting
interpenetration entirely, the inner electron would be subject, on the
average, to the field of the bare nucleus, while the outer electron would
be subject, on the average, to the field obtained by condensing the inner
one upon the nucleus. Consequently a good product-form approximate
solution of the Hamiltonian $H_B$ should be $v_Z(1)w_{Z-1}(2)$, where $v_Z$ is a
1s function for the bare nucleus, and $w_{Z-1}$ is an $n'$, $l'$, $m_l'$ function for
the nucleus of atomic number $Z - 1$. Using these initial approximations
we can form functions $\Phi_1$, $\Phi_2$, $\Phi_3$, $\Phi_4$ by permutation and linear combination,
as before, but they are not then eigenfunctions of any $H_0$. In
fact $v_Z(1)w_{Z-1}(2)$ is an eigenfunction of

$$H_{01} = -\frac{h^2}{8\pi^2\mu}(\nabla_1^2 + \nabla_2^2) - \frac{Ze^2}{r_1} - \frac{(Z-1)e^2}{r_2}, \tag{65·10}$$

while the permuted function $v_Z(2)w_{Z-1}(1)$ is an eigenfunction of

$$H_{02} = -\frac{h^2}{8\pi^2\mu}(\nabla_1^2 + \nabla_2^2) - \frac{(Z-1)e^2}{r_1} - \frac{Ze^2}{r_2}. \tag{65·11}$$

A linear combination is an eigenfunction of neither $H_{01}$ nor $H_{02}$. This
fact takes the calculation out of the domain of the conventional perturbation
theory of Sec. 48 and puts it into the domain of the variational method
of Sec. 51.





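The procedure is the same as for the Heitler-London calculation of the
energy of the H₂ molecule (Sec. 52b). Equation (65·6) is still applicable,
but its reduction takes a slightly different form. We have

$$H_Bv(1)w(2) = H_{01}v(1)w(2) + H_{12}v(1)w(2) = [E^{(0)} + H_{12}]\,v(1)w(2), \tag{65·12}$$

$$H_Bv(2)w(1) = [E^{(0)} + H_{21}]\,v(2)w(1), \tag{65·13}$$

where

$$H_{12} = \frac{e^2}{r_{12}} - \frac{e^2}{r_2}, \qquad H_{21} = \frac{e^2}{r_{12}} - \frac{e^2}{r_1}. \tag{65·14}$$

The reduction of (65·6) with the above yields

$$\begin{aligned}
E_\lambda^{(1)} ={}& \tfrac{1}{2}\int\cdots\int H_{12}\,|v(1)|^2|w(2)|^2\,d\tau_1d\tau_2
+ \tfrac{1}{2}\int\cdots\int H_{21}\,|v(2)|^2|w(1)|^2\,d\tau_1d\tau_2\\
&\pm \tfrac{1}{2}\int\cdots\int H_{12}\,v(1)v(2)^{*}w(1)^{*}w(2)\,d\tau_1d\tau_2
\pm \tfrac{1}{2}\int\cdots\int H_{21}\,v(2)v(1)^{*}w(2)^{*}w(1)\,d\tau_1d\tau_2.
\end{aligned}$$

The first integral is equal to the second and the third to the fourth. Let

$$\begin{aligned}
I_1' &= \int\cdots\int H_{12}\,|v(1)|^2|w(2)|^2\,d\tau_1d\tau_2,\\
I_2' &= \int\cdots\int H_{12}\,v(1)v(2)^{*}w(1)^{*}w(2)\,d\tau_1d\tau_2.
\end{aligned} \tag{65·15}$$

Then

$$E_\lambda^{(1)} = I_1' - I_2' \quad (\lambda = 1, 2, 3); \qquad E_4^{(1)} = I_1' + I_2'. \tag{65·16}$$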

The first-order approximate energy determination is thus reduced
to the problem of working out two sextuple integrals. The reader is
referred to Heisenberg's paper¹ for the details of the integration. The
most important result of this work is the conclusion that the exchange
integral $I_2'$ is positive. It follows that singlet-system energy levels
should lie above the corresponding triplet levels in accordance with
experiment. The calculated differences between corresponding singlet
and triplet energies are in fair agreement with experiment considering
the roughness of the computation. Thus the computed difference
between the quantum defects of the term 1s 2p ¹P and the term 1s 2p ³P
checks with the observed difference to within 20 per cent. The agreement
for the 1s 3p ¹P and 1s 3p ³P terms is within 15 per cent, while that
for the corresponding terms of the Li⁺ ion is within 10 per cent.
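The relation between the two integrals and the level ordering can be made concrete with a few lines of arithmetic. In this sketch the numerical values are purely hypothetical placeholders in arbitrary energy units, not results of Heisenberg's integration, and the function name is illustrative.

```python
def first_order_levels(e0, i1, i2):
    """Energies to first order: triplet = E0 + I1' - I2', singlet = E0 + I1' + I2'."""
    triplet = e0 + i1 - i2
    singlet = e0 + i1 + i2
    return {"triplet": triplet, "singlet": singlet, "splitting": singlet - triplet}

# Hypothetical values, chosen only to illustrate the ordering for positive I2'.
levels = first_order_levels(e0=-2.0, i1=0.125, i2=0.25)
print(levels)
```

With a positive exchange integral the singlet lies above the triplet by twice that integral, and the mean of the two levels is displaced from the central-field energy by the first integral alone.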

The calculation of the absolute positions of the terms is not at all 
good but becomes reasonably satisfactory when the first-order corrections 
are supplemented by those of second order. 

The helium-atom problem is fundamentally of the same type as the
fixed-nuclei hydrogen-molecule problem, since in each case we have to do
with two electrons in a fixed external force field. In dealing with H₂

1 Cf. second reference, footnote 1, p. 548.





in Sec. 52 we used spin-free wave functions. This procedure is legitimate
when there are not more than two electrons, and the spin-orbit energy $U_s$
is omitted from the Hamiltonian, because the orbital wave functions
derived from the spin-free theory can be antisymmetrized by the introduction
of properly chosen spin factors. Thus, since the orbital wave
function $\psi'$ of (52·4) is symmetric with respect to an interchange of
the space coordinates, it suffices to add the antisymmetric spin factor
$\alpha(1)\beta(2) - \beta(1)\alpha(2)$ in order to satisfy the Pauli principle. Similarly the
antisymmetric orbital function $\psi''$ can be correlated with any one of the
three symmetric spin factors $\alpha(1)\alpha(2)$, $\beta(1)\beta(2)$, $\alpha(1)\beta(2) + \beta(1)\alpha(2)$ to
satisfy the Pauli principle. Hence the energy level $E'$ is of the singlet-system
type, while $E''$ belongs to the triplet type. In this case the order
of the levels is opposite to that for helium, the singlet level being lower,
because the exchange integral has the opposite sign.

65b. The Exchange Phenomenon. — The “cause” of the energy
difference of corresponding singlet and triplet terms in both of these two-electron
problems is to be found in (a) the permutation degeneracy of
the zero-order product-form wave functions and in (b) the removal of the
permutation operator from the list of integrals of the motion when the
$e^2/r_{12}$ term in the potential energy is introduced. From the wave point
of view we have to do with the splitting of a degenerate vibration frequency
when the symmetry which produced the degeneracy is removed.
This is entirely analogous to the frequency splitting which accompanies
the coupling of two resonating electrical circuits. In the electrical
case the coupling results in a secular surging of the energy from one
circuit into the other and back again when the coupled system is set
into oscillation by the excitation of one circuit only. The frequency of
these surge oscillations is equal to the difference between the two possible
normal vibration frequencies of the coupled system. Essentially the
same phenomenon is possible in these atomic problems.

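Let $\psi_1$ and $\psi_2$ be eigenfunctions of the Hamiltonian of a two-electron
atomic system belonging to neighboring energy levels $E_1$ and $E_2$, respectively.
Let the system be started off in a subjective state with the wave
function $(\psi_1 + \psi_2)/\sqrt{2}$. The corresponding solution of the second
Schrödinger equation is

$$\Psi = \frac{1}{\sqrt{2}}\Big[\psi_1e^{-\frac{2\pi i}{h}E_1t} + \psi_2e^{-\frac{2\pi i}{h}E_2t}\Big].$$

A simple reduction gives

$$\Psi = \frac{1}{\sqrt{2}}\,e^{-\frac{2\pi i}{h}\frac{E_1+E_2}{2}t}\Big[(\psi_1 + \psi_2)\cos\frac{2\pi}{h}\Big(\frac{E_1-E_2}{2}\Big)t
+ i(\psi_2 - \psi_1)\sin\frac{2\pi}{h}\Big(\frac{E_1-E_2}{2}\Big)t\Big]. \tag{65·17}$$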



It follows that at times which are even multiples of $h/2(E_1 - E_2)$ the
probability density reduces to $|\psi_1 + \psi_2|^2/2$, while at times which
are odd multiples of $h/2(E_1 - E_2)$ the probability density reduces to
$|\psi_1 - \psi_2|^2/2$. Thus the system oscillates back and forth with the beat
frequency $(E_1 - E_2)/h$ from the state $(\psi_1 + \psi_2)/\sqrt{2}$ to the state
$(\psi_1 - \psi_2)/\sqrt{2}$.
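The beat phenomenon can be followed numerically by tracking the two expansion coefficients. The sketch below assumes units in which h = 1 by default and hypothetical energy values; the function names are illustrative. It computes the weight of the initial combination $(\psi_1 + \psi_2)/\sqrt{2}$ in the state at time t, which works out to $\cos^2[\pi(E_1 - E_2)t/h]$.

```python
import cmath
import math

def amplitudes(t, e1, e2, h=1.0):
    """Coefficients of psi_1 and psi_2 at time t for the start (psi_1 + psi_2)/sqrt(2)."""
    c1 = cmath.exp(-2j * math.pi * e1 * t / h) / math.sqrt(2)
    c2 = cmath.exp(-2j * math.pi * e2 * t / h) / math.sqrt(2)
    return c1, c2

def weight_plus(t, e1, e2, h=1.0):
    """Probability of finding the system in (psi_1 + psi_2)/sqrt(2) at time t."""
    c1, c2 = amplitudes(t, e1, e2, h)
    return abs((c1 + c2) / math.sqrt(2)) ** 2

# Hypothetical neighboring levels (arbitrary units, h = 1).
E1, E2 = 1.5, 1.0
unit = 1.0 / (2 * (E1 - E2))    # the time unit h / 2(E1 - E2)

for n in range(5):
    print(n, round(weight_plus(n * unit, E1, E2), 12))
```

The weight alternates between 1 at even multiples of the unit and 0 at odd multiples, so that one full beat occupies the time $h/(E_1 - E_2)$.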

In the problem B approximation these beats lead to a periodic
exchange of roles on the part of the two electrons provided that we
arbitrarily omit the spin coordinates from the wave functions and treat
the problem as we treated the fixed-nuclei H₂ molecule problem in Sec. 52.
(The spin-free wave functions permit the electrons to have different roles
because they are not necessarily either symmetric or antisymmetric
with respect to an interchange of coordinates, although they are eigenfunctions
of $H_B$ with the eigenvalues allowed by the Pauli principle.)

Let us assume, for example, that $\psi_1$ is the product of a symmetric
function $u_1$ of the spatial coordinates and an antisymmetric function
of the spin coordinates, whereas $\psi_2$ is the product of an antisymmetric
space function $u_2$ and a symmetric spin function. Dropping the spin
factors, we assume the initial wave function $(u_1 + u_2)/\sqrt{2}$ and apply
(65·17). At times which are even multiples of $h/2(E_1 - E_2)$ the probability
density takes the form $|u_1 + u_2|^2/2$, while at times which are
odd multiples of the same unit, it takes the form $|u_1 - u_2|^2/2$. But the
permutation operator $P_{12}$ converts $(u_1 + u_2)/\sqrt{2}$ into $(u_1 - u_2)/\sqrt{2}$
and $(u_1 - u_2)/\sqrt{2}$ into $(u_1 + u_2)/\sqrt{2}$. Thus the passage of the time
interval $h/2(E_1 - E_2)$ is equivalent in effect to the application of the
permutation $P_{12}$ and can be said to interchange the roles of the two electrons
in the wave function. This exchange of places becomes more concrete
and vivid if we make use of the case A approximations

$$\frac{1}{\sqrt{2}}\big[v(1)w(2) + w(1)v(2)\big] \quad\text{and}\quad \frac{1}{\sqrt{2}}\big[v(1)w(2) - w(1)v(2)\big]$$

of Eq. (65·6) for $u_1$ and $u_2$, respectively. With the identification,

$$\frac{1}{\sqrt{2}}(u_1 + u_2) = v(1)w(2), \qquad \frac{1}{\sqrt{2}}(u_1 - u_2) = w(1)v(2).$$

Thus the resonance phenomenon results in a periodic exchange of the
identities of the electrons associated with the central-field states indicated
by $v$ and $w$.

In the case of the H₂ molecule problem of Sec. 52b we can identify $u_1$
and $u_2$ in first approximation with the functions $\psi'$ and $\psi''$ of Eq. (52·4).
Equation (65·17) then leads to a periodic exchange in the identities of
the electrons associated with the two nuclei. It should be clearly understood,
however, that these exchange “pictures” cannot be formulated in
terms of the complete antisymmetric wave functions with spin arguments
included. Moreover, it is certainly wrong to think of the periodic
exchange of places of the electrons as the cause of the resonance-energy
difference $E_1 - E_2$.

66. DIAGONAL SUMS AND THE PROBLEM B ENERGY LEVELS 

The procedure used in Sec. 65 for securing zero-order (central-field)
antisymmetric wave functions which diagonalize $\mathcal{L}^2$ and $\mathcal{S}^2$ is a very
special one applicable only to the particular problem there considered.
(It is impossible to set up complete antisymmetric wave functions which
are the products of space and spin factors if there are more than two
electrons in the system. There are only two kinds of individual-electron
spin functions and hence it is not possible to make an antisymmetric
spin function involving more than two electrons.)

There are a number of more general procedures, however, for locating
the energy levels of problem B in first-order approximation. The first
and most obvious of these is the following. Let the Slater wave functions
$\Phi_\alpha = \mathcal{G}\phi_\alpha$ for the configuration under consideration be listed. For a
given pair of values of $M_L$ and $M_S$ there will be a finite number of functions,
say $\Phi_1, \Phi_2, \cdots, \Phi_n$. The finite energy matrix based on these functions
can then be worked out and diagonalized by solving the corresponding
secular equation (48·7).

In practice the calculation is greatly simplified by an observation
due to Slater (loc. cit., footnote 3, p. 535). We introduce the modified
procedure by the consideration of the special case where there are two
equivalent p electrons outside the closed shells. Table III, p. 540, shows
the possible $M_L$ and $M_S$ values with the designations of the Slater wave
functions for each.

Suppose we begin with the square of the table for which $M_L = 2$,
$M_S = 0$. There is a single Slater wave function, which must accordingly
be a simultaneous eigenfunction of $L^2$ and $S^2$. It belongs to the $^1D$ term
as previously noted. The corresponding secular determinant is of the
first order and the energy is given by the diagonal element.

Similarly the columns of the table for $M_S = 1$ and $M_S = -1$ have one
wave function in each square. The mean value of $H_1$ for any of these
functions gives the energy of the $^3P$ state. When this is done there
remains only the $^1S$ state to be calculated. The first-order energy
correction for this state is one of the roots of the secular determinant
for the three Slater wave functions

$\Phi_1(0,0) = (1^+\,{-1}^-)$, $\Phi_2(0,0) = (0^+\,0^-)$, $\Phi_3(0,0) = (1^-\,{-1}^+)$,

the other two roots being the previously computed first-order energy
corrections of the $^1D$ and $^3P$ states. Fortunately it is unnecessary to
compute the off-diagonal elements of the matrix of $H_1$ for the (0,0) square
of the table, since the sum of the diagonal terms of a finite square matrix
is invariant under a canonical transformation (cf. Sec. 44b). Hence the
sum of the diagonal elements of this matrix is equal to the sum of the roots
of the secular equation. [From elementary algebra we know that the sum of the
roots of an algebraic equation of the form

$x^n + A_1x^{n-1} + \cdots + A_n = 0$

is $-A_1$. The expansion of the secular determinant shows that this
coefficient is $-\sum_{k=1}^{3}(H_1)_{kk}$.] Therefore the energy correction for the
$^1S$ state is obtainable by subtracting off the values of the energy correction
for the $^1D$ and $^3P$ states from the diagonal sum in question. Thus the location of the
energy levels for the special case under consideration is reduced in first
approximation to the calculation of the diagonal terms of the energy
matrix based on Slater wave functions.
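
The diagonal-sum argument can be illustrated numerically. The following is a minimal sketch in Python and is not part of the original treatment: the 3×3 symmetric matrix is a hypothetical stand-in for the energy matrix of the (0,0) square, two of its roots are assumed to be already known, and the third is obtained from the diagonal sum alone. The names `det3`, `secular`, and `remaining_root` are illustrative assumptions.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def secular(m, lam):
    """Value of the secular determinant det(M - lam * I)."""
    shifted = [[m[i][j] - (lam if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

def remaining_root(m, known_roots):
    """Diagonal sum minus the roots already known gives the last root."""
    return sum(m[i][i] for i in range(3)) - sum(known_roots)

# Hypothetical stand-in matrix; its roots are 2 - sqrt(2), 2, 2 + sqrt(2).
matrix = [[2.0, -1.0, 0.0],
          [-1.0, 2.0, -1.0],
          [0.0, -1.0, 2.0]]
known = [2.0 - math.sqrt(2.0), 2.0]      # roots assumed found beforehand

third = remaining_root(matrix, known)     # uses diagonal elements only
residual = secular(matrix, third)         # check against the full determinant
print(third, residual)
```

The residual should vanish to rounding error, which indicates that the root found from the diagonal sum satisfies the secular equation although the off-diagonal elements were never used in finding it.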

In most cases the application of the procedure sketched above suffices
to eliminate the necessity of computing any off-diagonal matrix elements.
However, if there are two or more terms of the same type originating in
the same configuration, it is necessary to work out some off-diagonal
elements. In this case every secular equation involving one of the terms
will involve the other as well. Consequently it becomes necessary either
to evaluate the off-diagonal matrix elements of $H_1$ needed for the solution
of one of these secular equations, or to solve the simpler secular equation
which takes its place when the Slater wave functions are replaced by
others which diagonalize $L^2$ and $S^2$.

For more complete and detailed information regarding the theory of
atomic spectra the reader is again referred to the comprehensive work
on that subject by Condon and Shortley.



APPENDIX A 


THE CALCULUS OF VARIATIONS AND THE PRINCIPLE OF LEAST ACTION¹

The Fundamental Problem. — The calculus of variations deals with a 
generalization of the ordinary problem of maxima and minima of a 
function of one or more independent variables. This generalization 
permits the reformulation of problems involving differential equations 
as problems in maxima and minima. 

The fundamental problem of the calculus of variations in its simplest
form is the following. Let $F(x, y, t)$ denote a continuous and twice
differentiable function of the three arguments $x$, $y$, $t$. Let $x$ be a
continuous differentiable function of $t$, and let $y$ be identified with the
derivative of $x$ with respect to $t$, which we denote by $\dot{x}$. Then $F(x, \dot{x}, t)$
becomes an implicit function of the single argument $t$, although explicitly
a function of the three variables $x$, $\dot{x}$, $t$. Consider the integral

$J[x] = \int_{t_0}^{t_1} F(x, \dot{x}, t)\,dt$. (A-1)

$J$ depends on the form of the function $x(t)$ but is not a function of $x$ in the
ordinary sense since there is no one-to-one correlation of values of $x$ with
values of $J$. We call $J$ a function of a function, or a functional. It is
required to find those functions $x(t)$ having continuous second derivatives
which yield a maximum or a minimum value of $J$ when we consider only
comparison functions with the same terminal values
$x(t_0) = x_0$, $x(t_1) = x_1$.

Euler's Equation. — In the case of an ordinary function $y$ of an independent
variable $x$, a necessary condition for a maximum or minimum is that $dy/dx$
or the differential $dy$ shall vanish at the point under consideration.
A similar necessary condition for the minimization of $J$ can be set
up and yields a differential equation (Euler's) for $x(t)$. A solution
of this equation subject to the given terminal conditions may yield
either a maximum or minimum value of $J$ or a generalized "point of
inflection." For many physical applications it makes no difference
which.

¹ Cf. Sec. 3, p. 9.

In order to formulate Euler's equation we form the one-parameter
family of comparison functions

$x(t, \alpha) = \varphi(t) + \alpha\eta(t)$, (A-2)

in which $\varphi(t)$ satisfies the terminal conditions and $\eta(t)$ is any continuous
and twice-differentiable function which vanishes at $t_0$ and $t_1$. Inserting
$x(t, \alpha)$ for $x$ in Eq. (A-1) we convert $J$ into a function of $\alpha$:

$J(\alpha) = \int_{t_0}^{t_1} F[x(t,\alpha), \dot{x}(t,\alpha), t]\,dt$. (A-3)

Let us now assume that the function $x = \varphi(t)$ yields either a maximum
or a minimum value of $J$. It follows that $dJ/d\alpha$ vanishes when $\alpha$ is
zero, or, what amounts to the same thing, that the principal part of the
increment of $J$ due to an arbitrary increment $\delta\alpha$ in $\alpha$ is zero if the initial
value of $\alpha$ is zero. This principal part is indicated by the symbol $\delta J$
and is called the first variation of $J$. Its value is by definition

$\delta J = \left(\frac{dJ}{d\alpha}\right)_{\alpha=0}\delta\alpha$. (A-4)

Similarly if $Q$ is any quantity depending on the function $x(t)$ we define
the first variation of $Q$ by the equation

$\delta Q = \left(\frac{\partial Q}{\partial\alpha}\right)_{\alpha=0}\delta\alpha$.

It will be observed that since $x(t, \alpha)$ involves $\varphi(t)$ and $\eta(t)$, $\delta Q$ depends
on the choice of these two functions. To avoid confusion regarding
this choice we shall sometimes replace the symbol $\delta Q$ by a more explicit
symbol which names the functions chosen.

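Applying the above general definition of $\delta Q$ to the quantities $x$ and $\dot{x}$
we obtain

$\delta x = \left[\frac{\partial}{\partial\alpha}x(t,\alpha)\right]_{\alpha=0}\delta\alpha = \eta(t)\,\delta\alpha$,
$\delta\dot{x} = \left[\frac{\partial}{\partial\alpha}\dot{x}(t,\alpha)\right]_{\alpha=0}\delta\alpha = \dot{\eta}(t)\,\delta\alpha = \frac{d}{dt}\delta x$.

In these special cases $\delta Q$ is independent of $\varphi$, i.e., of the unvaried
expression for $x(t)$.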

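It is easy to prove that all the ordinary rules of the differential calculus
apply to the computation of variations. Hence

$\delta F = \frac{\partial F}{\partial x}\delta x + \frac{\partial F}{\partial\dot{x}}\delta\dot{x} = \frac{\partial F}{\partial x}\delta x + \frac{\partial F}{\partial\dot{x}}\frac{d}{dt}\delta x$. (A-5)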



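The limits of integration for $J$ are independent of $\alpha$ and we may
differentiate under the integral sign to obtain

$\delta J = \int_{t_0}^{t_1}\left[\frac{\partial F}{\partial x}\delta x + \frac{\partial F}{\partial\dot{x}}\frac{d}{dt}\delta x\right]dt$. (A-6)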


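On integration by parts this yields

$\delta J = \int_{t_0}^{t_1}\left[\frac{\partial F}{\partial x} - \frac{d}{dt}\left(\frac{\partial F}{\partial\dot{x}}\right)\right]\delta x\,dt + \left[\frac{\partial F}{\partial\dot{x}}\delta x\right]_{t_0}^{t_1} = \int_{t_0}^{t_1}\left[\frac{\partial F}{\partial x} - \frac{d}{dt}\left(\frac{\partial F}{\partial\dot{x}}\right)\right]\delta x\,dt$. (A-7)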


In equations (A-5), (A-6), and (A-7) the partial derivatives of $F$ are to be
carried out treating $F$ as a function of the three independent variables
$x$, $\dot{x}$, $t$, while the symbol $d/dt$ denotes a differentiation with respect to $t$
treated as the sole independent variable.

If $x = \varphi(t)$ gives a true maximum or minimum for $J$ it is clear that
$\delta J$ must vanish for every choice of the arbitrary function $\delta x = \eta(t)\,\delta\alpha$.
$J$ is then said to be stationary. Then the cofactor of $\delta x$ in the right-hand
member of Eq. (A-7) must vanish identically. In other words, $x = \varphi(t)$
must be a solution of

$\frac{\partial F}{\partial x} - \frac{d}{dt}\left(\frac{\partial F}{\partial\dot{x}}\right) = 0$. (A-8)


This is Euler's equation.

The converse proposition that solutions of Eq. (A-8) which satisfy
the terminal conditions yield either a maximum or minimum value of $J$
is not correct, but we can say that such solutions give zero for the first
variation in $J$. They are called extremals and are of great intrinsic
importance in mathematical physics independent of the fact that they
include all functions which give maximum or minimum $J$ values.


Example I. — Let $T$ and $V$ denote respectively the kinetic and potential energies
of a particle moving in one dimension along a straight line. Let $L$ denote the
Lagrangian function or kinetic potential $T - V$. Identifying $F$ with $L$ we see that
the process of "extremalizing" the integral

$J = \int_{t_0}^{t_1}\left[\tfrac{1}{2}\mu\dot{x}^2 - V(x)\right]dt$

is equivalent to solving the differential equation

$\mu\ddot{x} + \frac{dV}{dx} = 0$.

Thus Newton's law of motion is identical with Euler's equation for the integral $J$.
This is a special case of Hamilton's principle.
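
The stationary property of Example I can be checked numerically. The sketch below is an illustration under stated assumptions rather than anything taken from the text: the mass and force constant are set to unity with $V(x) = x^2/2$, the time interval runs from 0 to 1, the comparison function is $\eta(t) = \sin \pi t$, and the integral is estimated by the trapezoid rule. The function names are hypothetical.

```python
import math

T1 = 1.0     # final time t1 (t0 = 0); mass and force constant taken as 1
N = 2000     # number of trapezoid panels

def action(x, xdot, eta, etadot, alpha):
    """Trapezoid estimate of J for the path x + alpha*eta, with V(x) = x**2 / 2."""
    h = T1 / N
    total = 0.0
    for i in range(N + 1):
        t = i * h
        pos = x(t) + alpha * eta(t)
        vel = xdot(t) + alpha * etadot(t)
        weight = 0.5 if i in (0, N) else 1.0
        total += weight * (0.5 * vel ** 2 - 0.5 * pos ** 2)
    return total * h

def first_variation_rate(x, xdot, eps=1e-3):
    """Central-difference estimate of dJ/d(alpha) at alpha = 0."""
    eta = lambda t: math.sin(math.pi * t / T1)          # vanishes at both ends
    etadot = lambda t: (math.pi / T1) * math.cos(math.pi * t / T1)
    plus = action(x, xdot, eta, etadot, eps)
    minus = action(x, xdot, eta, etadot, -eps)
    return (plus - minus) / (2.0 * eps)

# Natural motion: a solution of x'' = -x with x(0) = 0.
natural = first_variation_rate(math.sin, math.cos)

# Straight line through the same end points; it does not obey Newton's law.
slope = math.sin(T1) / T1
straight = first_variation_rate(lambda t: slope * t, lambda t: slope)

print(natural, straight)
```

For the natural motion the estimated rate of change of $J$ is zero to within the quadrature error, whereas for the straight-line comparison path it is distinctly different from zero.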

Several Dependent Variables. — A somewhat more general problem
is that of "extremalizing" the integral $J$ when $F$ has a number of argument
functions. For example, we may assume that $F$ depends on
$x_1, \dot{x}_1, x_2, \dot{x}_2, \ldots, x_n, \dot{x}_n, t$. Then a slight extension of the foregoing
argument shows that the functions $x_k(t)$ define an extremal trajectory or curve when
they satisfy the set of simultaneous equations

$\frac{\partial F}{\partial x_k} - \frac{d}{dt}\left(\frac{\partial F}{\partial\dot{x}_k}\right) = 0$, $k = 1, 2, \ldots, n$. (A-9)

If $F$ is identified with the Lagrangian function or kinetic potential $L$
in a multidimensional mechanical problem, the above Euler equations
become the Lagrangian equations of classical mechanics. We conclude
that if the equations of motion can be reduced to the Lagrangian form
(A-9) by the introduction of a suitably defined function $L$ having the
coordinates, their velocities, and the time as arguments, the trajectory
of the system executing its natural motion is such that

$\delta\int_{t_0}^{t_1} L\,dt = 0$. (A-10)

This is Hamilton's principle in one of its more general forms. The
comparison trajectories allowed in the formulation of this principle
are restricted to those which carry the system from a common initial
configuration $A$ to a common final configuration $B$ in the common time
interval $t_1 - t_0$. If the forces acting on the system are derivable from a
function $V$ of the coordinates and the time, Lagrange's equations and
Hamilton's principle are always applicable with

$L = T - V$.

Principle of Least Action. — The principle of least action in its usual
form is restricted to conservative systems, i.e., to systems subject to
forces derivable from a potential function $V$ which is independent of the
time. To set up this principle we consider the time integral of a multiple
of the kinetic energy $T$ of the system over its trajectory from an initial
configuration $A$ to a final configuration $B$. The value of the integral for
the natural trajectory is stationary with respect to values obtained from
neighboring non-mechanical trajectories having the same total energy and
carrying the system from the same initial configuration to the same final
configuration, but not necessarily in the same time.

We shall verify the principle for the case of a single particle by
showing that it is equivalent to Newton's law of motion. Let $E$ be the
total energy and let $ds$ denote an element of arc. The action integral $S$
for the path $AB$ is defined as follows:

$S = \int_A^B 2T\,dt = \int_A^B p\,ds$. (A-11)

Here $p$ denotes the absolute value of the momentum for the point
determined by the energy equation

$\frac{p^2}{2\mu} = E - V(x,y,z)$.

The principle of least action requires that $\delta S = 0$ for the natural orbit
when the trajectory is varied, but $E$ is held constant.

Usually this makes $S$ an actual minimum, but there are exceptions.²
We cannot apply Euler's equations directly to $S$ as given by Eq. (A-11)
since it is not in the standard form of Eq. (A-1). However, it may be
carried over into the standard form by the introduction of an auxiliary
variable $u$ which increases monotonically from 0 to 1 as the particle
moves from $A$ to $B$. Let

$x = f_1(u)$, $y = f_2(u)$, $z = f_3(u)$,

$p = \sqrt{2\mu[E - V(x,y,z)]}$.

Denoting differentiation with respect to $u$ by a prime, we have

$S = \int_0^1 p(E,x,y,z)\left[(x')^2 + (y')^2 + (z')^2\right]^{1/2}du$. (A-12)

As this expression for $S$ is in the standard form we may write down Euler's
equation for the $x$ coordinate:

$\frac{\partial p}{\partial x}\left[(x')^2 + (y')^2 + (z')^2\right]^{1/2} - \frac{d}{du}\left\{p\,x'\left[(x')^2 + (y')^2 + (z')^2\right]^{-1/2}\right\} = 0$.

On reduction we obtain

$\frac{\partial p}{\partial x} - \frac{d}{ds}\left(p\frac{dx}{ds}\right) = 0$. (A-13)

Multiply through by $p/\mu$ and make use of the relations

$\frac{\mu}{p}\,ds = \frac{ds}{v} = dt$, $\frac{p}{\mu}\frac{\partial p}{\partial x} = -\frac{\partial V}{\partial x}$.

Equation (A-13) takes the final form

$\frac{\partial V}{\partial x} + \mu\frac{d^2x}{dt^2} = 0$,

which brings us back to Newton's law. As the other two equations
reduce in the same way, we conclude that the principle of least action
is equivalent to Newton's second law of motion.

² Cf. E. T. Whittaker, Analytical Dynamics, §103, p. 250, Cambridge, 1917.



The distinguishing feature of the principle of least action is that it
leads to purely geometric (time-free) orbital differential equations
(A-13). Both Hamilton's principle and the principle of least action owe
their importance primarily to the fact that they afford formulations of the
dynamical laws which are independent of any reference to particular
coordinate systems.

Least Action with Variable Mass. — In order to extend the principle
of least action to problems in which the mass of the particle varies with
its speed according to the relativistic law

$\mu = \frac{m_0}{\sqrt{1 - v^2/c^2}}$, (A-14)

we define the action integral by the equation [cf. (A-11)]

$S = \int_A^B \mu v\,ds = \int_A^B p\,ds$, (A-15)

although in this case $\mu v\,ds$ is not equivalent to $2T\,dt$. To express the
momentum $p$ in terms of the space coordinates the Newtonian energy
equation must be replaced by the relativistic energy equation, from which
we obtain

$p(x,y,z,E_r)^2 = \frac{1}{c^2}\left\{[E_r - V(x,y,z)]^2 - m_0^2c^4\right\}$. (A-16)

As in the Newtonian case, we derive Eq. (A-13) from the definition of $S$
and this reduces immediately to the form

$\frac{p}{\mu}\frac{\partial p}{\partial x} = \frac{1}{2\mu}\frac{\partial(p^2)}{\partial x} = \frac{d}{dt}\left(\mu\frac{dx}{dt}\right)$.

Introducing Eq. (A-16) we obtain the equation of motion

$\frac{d}{dt}\left(\mu\frac{dx}{dt}\right) = -\frac{\partial V}{\partial x}$ (A-17)

and corresponding equations for the $y$ and $z$ coordinates. As these are
the correct basic equations for the motion of a particle with variable
mass, the principle of least action is verified for this case also.

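Several Independent Variables. — The method of the calculus of
variations can be applied when there are two or more ultimate independent
variables. In this case we start with a multiple integral such as

$J[u] = \iint_G F(u, u_x, u_y, x, y)\,dx\,dy$. (A-18)

This integral is extended over a region $G$ of the $x,y$ plane on the boundary
of which the undetermined function $u(x,y)$ is required to have predetermined
values. $u_x$ and $u_y$ are the partial derivatives of $u$ with respect to
$x$ and $y$. Starting with a primary function $\varphi(x,y)$ which satisfies the
boundary conditions we form the family of comparison functions

$u(x,y,\alpha) = \varphi(x,y) + \alpha\eta(x,y)$

as before. $\eta(x,y)$ is arbitrary except for continuity restrictions and the
requirement that it shall vanish on the boundary of $G$. Defining the first
variation as before, we obtain

$\delta J = \iint_G\left[\frac{\partial F}{\partial u}\delta u + \frac{\partial F}{\partial u_x}\frac{\partial}{\partial x}\delta u + \frac{\partial F}{\partial u_y}\frac{\partial}{\partial y}\delta u\right]dx\,dy$.

Integrating by parts this becomes

$\delta J = \iint_G\left[\frac{\partial F}{\partial u} - \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial u_x}\right) - \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial u_y}\right)\right]\delta u\,dx\,dy + \int_\Gamma\left[\frac{\partial F}{\partial u_x}\frac{dy}{ds} - \frac{\partial F}{\partial u_y}\frac{dx}{ds}\right]\delta u\,ds$. (A-19)

$\Gamma$ is the boundary of $G$. Since $\delta u$ vanishes on $\Gamma$, $\delta J$ vanishes, i.e., $J$ is
stationary, when $\varphi(x,y)$ is a solution of the partial differential equation

$\frac{\partial F}{\partial u} - \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial u_x}\right) - \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial u_y}\right) = 0$. (A-20)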


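Example II. — Consider the vibrations of a stretched string of elastic modulus
$\lambda$ and mass per unit length $\rho$. Let the string extend along the $x$ axis from the origin
to the point $x = l$. Let $u$ denote the displacement in the direction of the $y$ axis, the
other components of displacement being assumed to vanish. Hamilton's principle
requires that the integral

$J[u] = \int_{t_0}^{t_1}(T - V)\,dt$

shall be stationary for the natural motion. In this case

$T = \tfrac{1}{2}\int_0^l \rho u_t^2\,dx$, $V = \tfrac{1}{2}\int_0^l \lambda u_x^2\,dx$,

and we have

$\delta\int_{t_0}^{t_1}dt\int_0^l dx\left[\tfrac{1}{2}\rho u_t^2 - \tfrac{1}{2}\lambda u_x^2\right] = 0$.

Equation (A-20) takes the familiar form

$\rho\frac{\partial^2u}{\partial t^2} = \lambda\frac{\partial^2u}{\partial x^2}$.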

The calculus of variations and its applications to mechanics are
discussed in a great many textbooks, but the writer is particularly
indebted to the excellent treatment of the subject in Chap. IV of Courant
and Hilbert's Methoden der mathematischen Physik, I, Berlin, 1924 and
1931, and warmly recommends this discussion to the reader.



APPENDIX B 


DERIVATION OF EQUATION (15-7)¹


We desire to evaluate the function $\Psi(x,y,z,t)$ of Eq. (15-6) for values
of $x$, $y$, $z$, $t$ large enough to serve for momentum measurements. The
components of the apparent, or most probable, momentum for any
such set of values of $x$, $y$, $z$, $t$ are

$\xi = \frac{mx}{t}$, $\eta = \frac{my}{t}$, $\zeta = \frac{mz}{t}$.

It is convenient to introduce these along with $t$ as the variables upon
which $\Psi$ depends. The probability of the momentum $p_x$, $p_y$, $p_z$ will now
be fixed by the limiting form of $\Psi$ when $t$ becomes infinite while $\xi$, $\eta$, $\zeta$ have the
constant values $p_x$, $p_y$, $p_z$.

To avoid cumbersome formulas we consider here only the one-dimensional
case where $\Psi$ is independent of $y$ and $z$. There is no fundamental
difficulty in extending the result to any number of dimensions.
We suppose, then, that

$\Psi(x,t) = \int_{-\infty}^{\infty} G(\sigma)\,e^{2\pi i\left[\sigma x - \frac{h\sigma^2t}{2m}\right]}d\sigma$. (B-1)

In order to throw the exponential into more convenient form we make the
change of variables

$x = \frac{\xi t}{m}$, $t = \frac{m}{2h\alpha^2}$, $\sigma = \frac{\xi}{h} + \alpha u$.

Equation (B-1) now takes the form

$\Psi = \alpha\,e^{\frac{i\pi\xi^2}{2h^2\alpha^2}}\int_{-\infty}^{\infty} G\left(\frac{\xi}{h} + \alpha u\right)e^{-\frac{i\pi u^2}{2}}du$. (B-2)


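It is convenient to split the integration into two parts covering the ranges
$u > 0$ and $u < 0$, respectively. Introducing the Fresnel integrals²

$C(u) = \int_0^u\cos\frac{\pi q^2}{2}\,dq - \tfrac{1}{2} = -\int_u^\infty\cos\frac{\pi q^2}{2}\,dq$,

$S(u) = \int_0^u\sin\frac{\pi q^2}{2}\,dq - \tfrac{1}{2} = -\int_u^\infty\sin\frac{\pi q^2}{2}\,dq$,

we have

$\int_0^\infty G\,e^{-\frac{i\pi u^2}{2}}du = \int_0^\infty G\,\frac{d}{du}(C - iS)\,du$.

Integrating by parts and observing that $G$ vanishes when its argument
becomes positively or negatively infinite, we obtain

$\int_0^\infty G\,e^{-\frac{i\pi u^2}{2}}du = \Big[G\,(C - iS)\Big]_0^\infty - \int_0^\infty(C - iS)\frac{dG}{du}\,du$

$\qquad = \tfrac{1}{2}(1 - i)\,G\left(\frac{\xi}{h}\right) - \alpha\int_0^\infty(C - iS)\,G'\left(\frac{\xi}{h} + \alpha u\right)du$. (B-3)

Here $G'$ denotes the derivative of $G$ with respect to its original argument.
Similarly the range $u < 0$ gives

$\int_{-\infty}^0 G\,e^{-\frac{i\pi u^2}{2}}du = \int_0^\infty G\left(\frac{\xi}{h} - \alpha u\right)e^{-\frac{i\pi u^2}{2}}du$

$\qquad = \tfrac{1}{2}(1 - i)\,G\left(\frac{\xi}{h}\right) + \alpha\int_0^\infty(C - iS)\,G'\left(\frac{\xi}{h} - \alpha u\right)du$. (B-4)

Combining Eqs. (B-2), (B-3), and (B-4), we obtain

$\frac{1}{\alpha}\,e^{-\frac{i\pi\xi^2}{2h^2\alpha^2}}\,\Psi = (1 - i)\,G\left(\frac{\xi}{h}\right) - \alpha\int_0^\infty(C - iS)\left[G'\left(\frac{\xi}{h} + \alpha u\right) - G'\left(\frac{\xi}{h} - \alpha u\right)\right]du$. (B-5)

¹ Cf. Sec. 15b, p. 62.

² The definitions of $C$ and $S$ here used differ by the additive constant ½ from the
customary definitions.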

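Let $P$ denote the second term of the right-hand member of Eq. (B-5) with its sign reversed:

$P(\xi,\alpha) = \alpha\int_0^\infty(C - iS)\left[G'\left(\frac{\xi}{h} + \alpha u\right) - G'\left(\frac{\xi}{h} - \alpha u\right)\right]du$. (B-6)

By introducing suitable hypotheses regarding the behavior of the function
$G'(\xi/h + \alpha u) - G'(\xi/h - \alpha u)$ for large values of $u$ it is possible to prove
that $\lim_{\alpha\to0}P(\xi,\alpha) = 0$. To this end we introduce $\alpha u$ as the variable of
integration and denote it by the symbol $v$. Let the real and imaginary
parts of $G'(\xi/h + v) - G'(\xi/h - v)$ be $K_1(\xi/h, v)$ and $K_2(\xi/h, v)$ respectively.
Multiplying out the integrand, we can resolve $P$ into the sum of four
integrals of which the following is typical:

$P_1(\xi,\alpha) = \int_0^\infty C\left(\frac{v}{\alpha}\right)K_1\left(\frac{\xi}{h}, v\right)dv$. (B-7)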



The function $C(v/\alpha)$ is of an oscillatory character with the asymptotic
form

$C\left(\frac{v}{\alpha}\right) \cong \frac{\alpha}{\pi v}\sin\frac{\pi v^2}{2\alpha^2}$

for large values of the argument. $C(v/\alpha)$ approaches zero as $\alpha$ approaches
zero for all values of $v$ except zero, where it has the constant value $-\tfrac{1}{2}$.
At the same time the positive and negative arches get closer and closer
together and by mutual cancellation tend to reduce the integral. Hence
it is to be expected that if the integral $P_1(\xi,\alpha)$ exists at all [as it must if
Eq. (B-1) is valid],

$\lim_{\alpha\to0}P_1(\xi,\alpha) = 0$. (B-8)
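
The asymptotic form quoted for $C$ can be compared with a direct numerical evaluation. The sketch below assumes the shifted definition $C(u) = \int_0^u\cos(\pi q^2/2)\,dq - \tfrac{1}{2}$ and evaluates the integral by Simpson's rule; the function names and the sample arguments are illustrative choices only.

```python
import math

def fresnel_c_shifted(u, panels=20000):
    """C(u) = integral_0^u cos(pi q^2 / 2) dq - 1/2, by composite Simpson's rule."""
    h = u / panels                                   # panels must be even
    total = 1.0 + math.cos(math.pi * u * u / 2.0)    # end points (cos 0 = 1)
    for i in range(1, panels):
        q = i * h
        total += (4.0 if i % 2 else 2.0) * math.cos(math.pi * q * q / 2.0)
    return total * h / 3.0 - 0.5

def asymptotic(u):
    """Leading large-u form sin(pi u^2 / 2) / (pi u)."""
    return math.sin(math.pi * u * u / 2.0) / (math.pi * u)

for u in (3.0, 5.0, 7.0):
    print(u, fresnel_c_shifted(u), asymptotic(u))
```

At these arguments the two columns agree closely, and the common magnitude falls off as $1/\pi u$, which is the behavior relied upon when $u = v/\alpha$ becomes large for fixed $v$.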


The author has not been able to devise a rigorous proof of the above
statement for the general case but is indebted to the late Prof. O. D. Kellogg
for the following proof for the special case that $K_1(\xi/h, v)$ is a function of
bounded variation³ in $v$. It follows from the method of deriving $K_1$
from $G(\sigma)$ that $K_1$ like $G$ is analytic. Hence the requirement of a bounded
variation reduces to the condition that

$\int_0^\infty\left|\frac{\partial K_1}{\partial v}\right|dv$

shall exist. This restriction is very mild since $G(\sigma)$ is quadratically integrable. It permits
us by a fundamental theorem to replace $K_1$ by the difference of two
bounded, positive, monotonically decreasing functions which approach
zero as $v$ becomes infinite. Calling these functions $A(\xi,v)$ and $B(\xi,v)$,
respectively, we have the relation

$P_1(\xi,\alpha) = \lim_{b\to\infty}\int_0^b C\left(\frac{v}{\alpha}\right)\left[A(\xi,v) - B(\xi,v)\right]dv$. (B-9)

By the second theorem of the mean

$\int_0^b C\left(\frac{v}{\alpha}\right)A(\xi,v)\,dv = A(\xi,0)\int_0^\beta C\left(\frac{v}{\alpha}\right)dv$,

where $0 < \beta < b$. Due to the oscillatory character of $C(v/\alpha)$ the integral
$\int_0^\beta C(v/\alpha)\,dv$ is the sum of an alternating series of terms representing
the areas of the successive positive and negative arches of the integrand.
Each of these terms is smaller than its predecessor and hence the sum is
less than the area of the first complete arch to the right of the point $v = 0$.
Let $I$ denote the area of this largest arch:

$I = \alpha\int C(z)\,dz$ (the integral being extended over that arch),

³ Cf. Hobson, Theory of Functions of a Real Variable, edition of 1907, p. 256.



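and let $M$ denote the upper bound of $A(\xi,v)$. Then

$\left|\int_0^\beta C\left(\frac{v}{\alpha}\right)dv\right| \le 2I$

and

$\left|\int_0^b C\left(\frac{v}{\alpha}\right)A(\xi,v)\,dv\right| \le 2MI$.

Consequently, if $b$ becomes infinite,

$\left|\int_0^\infty C\left(\frac{v}{\alpha}\right)A(\xi,v)\,dv\right| \le 2MI$.

Next let $\alpha$ approach zero. Then $I$ goes to zero, and hence

$\lim_{\alpha\to0}\int_0^\infty C\left(\frac{v}{\alpha}\right)A(\xi,v)\,dv = 0$.

The same argument applies to the second term of Eq. (B-9), and so the
theorem of Eq. (B-8) is proved.

As $P_1$ is one of four similar integrals into which the second term of
Eq. (B-5) has been resolved, we infer that

$\lim_{\alpha\to0}\frac{1}{\alpha}\,e^{-\frac{i\pi\xi^2}{2h^2\alpha^2}}\,\Psi = (1 - i)\,G\left(\frac{\xi}{h}\right)$.

Inserting the values of $\alpha$ and $\xi$ we have that, for large values of $t$,

$\Psi(x,t) \cong (1 - i)\sqrt{\frac{m}{2ht}}\;e^{\frac{i\pi mx^2}{ht}}\,G\left(\frac{mx}{ht}\right)$.

This is a one-dimensional equivalent of Eq. (15-7) of Chap. II.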
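
The limiting form can be tested on a case where the integral is known in closed form. The sketch below rests on several assumptions that are not in the text: the packet is taken as $\Psi(x,t) = \int G(\sigma)\exp[2\pi i(\sigma x - h\sigma^2t/2m)]\,d\sigma$ with $h = m = 1$, the amplitude is the Gaussian $G(\sigma) = e^{-\sigma^2}$, and the large-$t$ form is taken as $(1 - i)\sqrt{m/2ht}\,e^{i\pi mx^2/ht}\,G(mx/ht)$. The function names are hypothetical.

```python
import cmath
import math

def psi_exact(x, t):
    """Closed form of integral exp(-s^2) exp[2 pi i (s x - s^2 t / 2)] ds (h = m = 1)."""
    a = 1.0 + 1j * math.pi * t           # coefficient of -s^2 in the exponent
    return cmath.sqrt(math.pi / a) * cmath.exp(-(math.pi * x) ** 2 / a)

def psi_asymptotic(x, t):
    """Assumed large-t form (1 - i) sqrt(1 / 2t) exp(i pi x^2 / t) G(x / t)."""
    xi = x / t                           # apparent momentum, with m = 1
    phase = cmath.exp(1j * math.pi * x * x / t)
    return (1 - 1j) * math.sqrt(1.0 / (2.0 * t)) * phase * math.exp(-xi * xi)

def relative_error(xi, t):
    """Relative difference of the two forms at fixed apparent momentum xi."""
    x = xi * t
    exact = psi_exact(x, t)
    return abs(psi_asymptotic(x, t) - exact) / abs(exact)

for t in (2.0, 20.0, 200.0):
    print(t, relative_error(0.5, t))
```

With the apparent momentum held fixed the relative difference shrinks as $t$ grows, in keeping with the statement that the correction term vanishes in the limit $\alpha \to 0$.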





APPENDIX C 


THEOREMS REGARDING THE LINEAR OSCILLATOR PROBLEM¹


Several statements occur in the text of Sec. 19 which require a more
rigorous discussion. The points in question are the following: (1) to
show that $\psi(x)$ becomes infinite or approaches zero as $x$ approaches either
boundary; (2) to show that for a given $\alpha$ and $E$ there is one, and only
one, integral curve which vanishes at $x = -\infty$; (3) to show that solutions
which vanish at $x = \pm\infty$ are quadratically integrable and hence satisfy
the complete boundary condition A of Sec. 17. In this appendix we
shall verify these statements and prove the integrability of

$\left|\frac{d\psi}{dx}\right|^2$ and $\left|\frac{d^2\psi}{dx^2}\right|^2$.

To prove point (1) we begin by observing that, since the curve is
convex to the axis in H, its slope cannot change sign more than once.
Consequently there exists a constant $x_1$ such that if $x > x_1 > x''$, $\psi(x)$ is
monotone. Then either $\psi(x)$ becomes infinite with $x$ or it approaches
a finite limit, say $B$. We wish to prove that in the latter case $B = 0$.
To this end we assume that $B \ne 0$ and seek a contradiction. Without
loss of generality we may assume that $B$ is positive. Then if $x > x_1$
the curve will be monotone decreasing and $\psi(x) > B$. By hypothesis,
$V(x)$ is greater than $E$ throughout the region H. Hence $\kappa(V - E)$ has
a positive lower bound $M$ in the region $x > x_1 > x''$. We conclude that
in this region

$\psi'' = \kappa(V - E)\psi > MB$.

Then

$\psi'(x) > \psi'(x_1) + MB(x - x_1)$.

Integrating again, we obtain

$\psi(x) > \psi(x_1) + \psi'(x_1)(x - x_1) + \frac{MB}{2}(x - x_1)^2$.

By hypothesis, $\psi(\infty)$ is finite. But this is clearly possible only if $B = 0$.
Hence $\psi(x)$ must become infinite or approach zero as $x$ approaches $\infty$.
In exactly the same manner we can show that $\psi(x)$ becomes infinite, or
approaches zero as $x$ approaches $-\infty$.

¹ Cf. Sec. 19, p. 83.


For point (2) we identify $\alpha$ and $\beta$ with the positive ordinate and the
slope of the curve at some point $x = \xi$ in the region F, just as we did in
the text. We consider the family of integral curves obtained by holding
$\alpha$ and $E$ constant, and varying $\beta$. No two of these curves can intersect
to the left of $\xi$ for, if they did, the difference between the two functions
would be a solution of the equation for the same value of $E$ which crosses
the axis twice to the left of $x'$. Hence if $\beta' > \beta''$, we conclude that
$\psi(\beta', x) < \psi(\beta'', x)$ to the left of $\xi$. For small positive values of $\beta$ the
curves will first approach the axis as $x$ moves out along the negative
axis, and then recede from it to become positively infinite as $x$ approaches
$-\infty$. For a sufficiently large positive value of $\beta$ the curve is readily
proved to cross the axis at some point $x_1$. As $x_1$ becomes increasingly
negative, $\beta$ must steadily decrease. Hence $\beta$ approaches a limit $\beta_1$
as $x_1 \to -\infty$. Then the curve $\psi(\alpha, \beta_1, E, x)$ is a solution of the differential
equation which satisfies the requirement that $\psi(-\infty) = 0$. For this
curve cannot cross the $x$ axis to the left of $\xi$ because, if it did, $\beta_1$ could
not be $\lim_{x_1\to-\infty}\beta(x_1)$. Then it must approach the $x$ axis as a limit. If it did
not, we could find a curve coming still closer to the $x$ axis without actually
reaching it. This curve would have a larger $\beta$ and its existence would
require that there be two curves for the same $\beta$, one crossing the axis, the
other not. Hence there is an integral curve for each pair of values of
$\alpha$ and $E$ which vanishes at $x = -\infty$. There can be only one, for otherwise
the difference would be a solution of the differential equation crossing
the axis in F but having a node at $-\infty$.

For the last point we proceed as follows. Consider the two functions
$\psi_1(x)$ and $\psi_2(x)$ which have the same positive ordinate at $x = \xi$ and which
vanish at $x = -\infty$. Let $\psi_1(x)$ satisfy the Eq. (18-2) and let $\psi_2(x)$ be a
solution of

$\psi_2'' + \kappa(E - V_2)\psi_2 = 0$,

where $V_2$ is a constant greater than $E$ but less than $V(x)$ when $x < \xi$.
We assume that $\xi < x'$. Under these circumstances it follows that $\psi_2$
is greater than $\psi_1$ in the region $-\infty < x < \xi$. To prove the point we
assume its converse

$\psi_2 \le \psi_1$

to hold in a region M to the left of the point $x = \xi$. At the boundary
points of M, say $x_a$ and $x_b$, the two curves may be supposed to cross. As
a special case, the region M may extend from $-\infty$ to $x_b$. From the
differential equations satisfied by $\psi_2$ and $\psi_1$ we have

$\frac{d}{dx}(\psi_1'\psi_2 - \psi_2'\psi_1) = \psi_1''\psi_2 - \psi_2''\psi_1 = \kappa(V - V_2)\psi_1\psi_2$.

Hence

$\Big[\psi_1'\psi_2 - \psi_2'\psi_1\Big]_{x_a}^{x_b} = \kappa\int_{x_a}^{x_b}(V - V_2)\psi_1\psi_2\,dx$.

The right-hand member of this equation is essentially positive and, since
the curve $\psi_2$ cannot have the greater slope at $x_a$ while $\psi_1$ cannot have the
greater slope at $x_b$, the left-hand member cannot be positive. Hence we
have a contradiction and no such region M can exist. We conclude
that $\psi_2 > \psi_1$ at every point in the region $-\infty < x < \xi$.
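
A concrete instance of this comparison may make the argument easier to follow. The sketch below is an illustration under assumed values only: $\kappa$ is set to unity, the potential is taken as $V(x) = x^2$ with eigenvalue $E = 1$ (so that $\psi_1 = e^{-x^2/2}$ and $x' = -1$), the comparison point is $\xi = -2$, and the constant is $V_2 = 2$.

```python
import math

XI = -2.0          # comparison point, to the left of the turning point x' = -1
E, V2 = 1.0, 2.0   # eigenvalue and comparison constant, E < V2 < V(x) for x < XI

def psi1(x):
    """Solution of psi'' + (E - x**2) psi = 0 that vanishes at minus infinity."""
    return math.exp(-0.5 * x * x)

def psi2(x):
    """Solution of psi'' + (E - V2) psi = 0 vanishing at minus infinity,
    scaled to have the same ordinate as psi1 at x = XI."""
    rate = math.sqrt(V2 - E)
    return psi1(XI) * math.exp(rate * (x - XI))

# Sample the region x < XI and test the inequality psi2 > psi1 point by point.
grid = [XI - 0.01 * k for k in range(1, 1001)]
dominated = all(psi2(x) > psi1(x) for x in grid)
print(dominated)
```

In this instance the exponential comparison function lies above the eigenfunction everywhere to the left of $\xi$, so that the convergence of the integral of $\psi_2^2$ carries over to $\psi_1^2$.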

It follows that if the integral

$\int_{-\infty}^{\xi}\psi_2^2\,dx$

exists, the function $\psi_1$ must be quadratically integrable. But $\psi_2$ has the
explicit form

$\psi_2 = A\,e^{\frac{2\pi x}{h}\sqrt{2\mu(V_2 - E)}}$, $x < \xi$,

from which it follows that the above integral does exist. A similar
argument can be used to prove the quadratic integrability of any real
discrete eigenfunction in the neighborhood of the right-hand boundary
and hence over the complete fundamental interval $-\infty < x < +\infty$.
Since the convergence of the integral is unaffected if we multiply $\psi$
by any complex constant, we infer that, if $\psi$ is any discrete eigenfunction,
real or complex,

$\int_{-\infty}^{\infty}\psi\psi^*\,dx$

exists.

It is equally clear that $\int_{-\infty}^{\xi}x^{2n}\psi_1^2\,dx$ converges, since the convergence of
$\int_{-\infty}^{\xi}x^{2n}\psi_2^2\,dx$ implies the convergence of the former integral. We readily infer
that if $\psi$ is any discrete eigenfunction, the product of $\psi$ into any function
of $x$ which is continuous except at infinity, where it may have a pole or
poles of finite order, is quadratically integrable. In particular, the
integrals $\int_{-\infty}^{\infty}V^2\psi\psi^*\,dx$ and $\int_{-\infty}^{\infty}V\psi\psi^*\,dx$ are convergent.

If we multiply the differential equation (18-2) by $\psi^*$ and integrate
from $-\infty$ to $+\infty$ we obtain

$\int_{-\infty}^{\infty}\psi^*\frac{d^2\psi}{dx^2}\,dx = \kappa\left[\int_{-\infty}^{\infty}V\psi\psi^*\,dx - E\int_{-\infty}^{\infty}\psi\psi^*\,dx\right]$.

If $\psi$ is a discrete eigenfunction with the eigenvalue $E$, the integrals on
the right converge, and the integral on the left must converge also. But

$\int_{-\infty}^{\infty}\psi^*\frac{d^2\psi}{dx^2}\,dx = -\int_{-\infty}^{\infty}\left|\frac{d\psi}{dx}\right|^2dx$.

Hence $d\psi/dx$ is quadratically integrable.

Finally we anticipate a result derived in Sec. 22 which states that if
any two functions are quadratically integrable any linear combination
of these functions will be quadratically integrable. Since it follows from
the differential equation that $d^2\psi/dx^2$ is a linear combination of $\psi$ and
$V\psi$, we infer that $d^2\psi/dx^2$ is quadratically integrable.



APPENDIX D 


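MATHEMATICAL NOTES ON THE B. W. K. METHOD¹

I. BEHAVIOR OF THE FUNCTIONS $f_u$ AND $f_v$ ON THE COMPLEX PLANE²

Consider first the special case of the parabolic potential curve

$V(x) = \tfrac{1}{2}kx^2$.

Introducing polar coordinates by means of the equations

$x - x' = r'e^{i\theta'}$, $x - x'' = r''e^{i\theta''}$,

we express $p^2$ in the form

$p^2 = 2\mu[E - V(x)] = -\mu k\,r'r''e^{i(\theta' + \theta'')}$.

As in footnote 2, p. 92, we assume a cut in the complex $x$ plane along
the axis of reals from $x'$ to $+\infty$ and require that $\theta'$ and $\theta''$ be restricted
to the range $0 \le \theta \le 2\pi$. Let $p(x,E)$ and $p(x,E)^{-1/2}$ be defined by the
equations

$p = e^{-\frac{i\pi}{2}}(\mu k\,r'r'')^{1/2}e^{\frac{i}{2}(\theta' + \theta'')}$, $p^{-1/2} = e^{\frac{i\pi}{4}}(\mu k\,r'r'')^{-1/4}e^{-\frac{i}{4}(\theta' + \theta'')}$. (D-1)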

The function $w(x,E)$ is defined by (21-6) with the provision that the path
of integration shall not cross the cut. $p$, $p^{-1/2}$, and $w$ are now uniquely
defined at all points of the cut plane. The following expressions for
these functions on different portions of the axis of reals are readily
derived. (The regions F, G, H are shown graphically in Fig. 4, p. 81,
and Fig. 6, p. 99.)

Upper lip of cut in H:
$p = -i|p|$, $p^{-1/2} = e^{\frac{i\pi}{4}}|p^{-1/2}|$, $w = \int_{x'}^{x''}|p|\,dx - i\int_{x''}^{x}|p|\,dx$.

Upper lip of cut in G:
$p = |p|$, $p^{-1/2} = |p^{-1/2}|$, $w = \int_{x'}^{x}|p|\,dx$.

In F:
$p = i|p|$, $p^{-1/2} = e^{-\frac{i\pi}{4}}|p^{-1/2}|$, $w = -i\int_{x}^{x'}|p|\,dx$.

Lower lip of cut in G:
$p = -|p|$, $p^{-1/2} = -i|p^{-1/2}|$, $w = -\int_{x'}^{x}|p|\,dx$.

Lower lip of cut in H:
$p = -i|p|$, $p^{-1/2} = -e^{\frac{i\pi}{4}}|p^{-1/2}|$, $w = -\int_{x'}^{x''}|p|\,dx - i\int_{x''}^{x}|p|\,dx$.

¹ Cf. Sec. 21.
² Cf. p. 97.

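The above restrictions on $p$, $p^{-1/2}$, and $w$ afford unique definitions of
$p^{-1/2}e^{2\pi iw/h}$ and $p^{-1/2}e^{-2\pi iw/h}$ at all points of the cut plane. For
convenience we identify $f_u$ and $f_v$ with the portions of these functions spread
out over the upper half of the cut plane. For the lower half of this plane we set

$\bar{f}_u = p^{-1/2}e^{2\pi iw/h}$, $\bar{f}_v = p^{-1/2}e^{-2\pi iw/h}$. (D-2)

Then $f_u$, $f_v$, $\bar{f}_u$, $\bar{f}_v$ are uniquely defined on the axis of reals, and for such
real points we have the relations:

In F: $\bar{f}_u = f_u$, $\bar{f}_v = f_v$. (D-3)

In G: $\bar{f}_u = -if_v$, $\bar{f}_v = -if_u$. (D-4)

In H: $\bar{f}_u = -e^{-\frac{4\pi i}{h}\int_{x'}^{x''}|p|\,dx}f_u$, $\bar{f}_v = -e^{\frac{4\pi i}{h}\int_{x'}^{x''}|p|\,dx}f_v$. (D-5)

$f_u$ and $\bar{f}_u$ both become infinite at each end of the axis of reals, while $f_v$
and $\bar{f}_v$ approach zero as $x$ moves out to infinity along either the positive or
negative axis of reals. For large values of $|x|$ the functions $f_u$, $\bar{f}_u$ are
dominant, $f_v$, $\bar{f}_v$ are subdominant.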

Further information can be derived from an examination of the lines of
constant absolute value and of constant phase angle (argument) of the
functions $f_u$, $f_v$. A line of each type can be drawn through every point
of the complex $x$ plane where $p$ does not vanish. Separating $x$ into real
and imaginary parts and describing $p$ by polar coordinates we write

$x = u + iv$, $p = |p|e^{i\chi}$.

The differential equation for the lines of constant absolute value is

$\frac{dv}{du} = -\tan\chi = -\tan\tfrac{1}{2}(\theta' + \theta'' - \pi)$. (D-6)

With the aid of this equation the general form of the lines is readily
sketched in, as shown in Fig. 6, p. 99, Sec. 21d. $p$ is real along the
perpendicular bisector of the line $x'x''$, and hence the exponential factor
of $f_u$ decreases in absolute value as the
point $x$ moves away from the axis of reals on this line. At large distances
from $x'$ and $x''$ the exponential factor is the controlling one in determining
the magnitude of the functions $f_u$ and $f_v$. We conclude that the
four cross-hatched regions in Fig. 6 have the following properties. In A
and B $f_u$ and $\bar{f}_u$ are dominant while $f_v$, $\bar{f}_v$ are subdominant. In C on
the other hand, $f_v$ is dominant while $\bar{f}_v$ is dominant in D.



As $Q$ and $Q/p^2$ become infinite at any simple zero of $p$, any curve
on which $f_u$ and $f_v$ are good approximate solutions of (18-2) will have to
keep away from the points $x'$, $x''$. If such a curve is to connect the
interval F of the axis of reals with the origin or with the interval H, it
must inevitably penetrate one of the regions C and D.

In the more general case of an anharmonic oscillator, such as that
contemplated in Fig. 5, the function $p^2/2\mu = E - V(x)$ will have pairs
of conjugate complex roots in addition to the real roots $x'$ and $x''$. These
roots complicate the discussion somewhat, making it necessary to introduce
new cuts radiating one from each complex zero to infinity in order
to get single-valued functions $f_u$, $f_v$. In order to study the variations
of $a_u$ and $a_v$ along a good path $\Gamma$ by means of Eqs. (21-18) and (21-19)
it is necessary to avoid discontinuities by choosing a path which does
not cross any of these cuts. The behavior of the lines of constant
absolute value in the neighborhood of $x'$ and $x''$ remains qualitatively the same,
however, provided that all the other roots are relatively distant from
the real pair. Hence any suitable path $\Gamma$ for which $\mu_\Gamma \ll 1$ must enter C
in passing around $x'$ and $x''$ in the upper half plane.


II. EVALUATION OF THE LOW-ENERGY LEVELS OF A ONE-DIMENSIONAL
OSCILLATOR BY THE B. W. K. METHOD¹

In the case of the lower energy levels of a one-dimensional oscillator,
one can establish the Bohr formula (21·10) by means of direct paths
leading from the region A of Fig. 6 to B around x' and x" without touching
the axis of reals in G. Let us assume that the coefficients $a_u(x)$ and
$a_v(x)$ are to be fitted to an eigenfunction $\psi_n(x, E_n)$ of Eq. (18·2) [cf.
Eqs. (21·15) and (21·16)]. $\psi_n$ and $f_v$ both vanish at each end of the axis
of reals, while $f_u$ does not. Hence the coefficient $a_u(x)$ must vanish at
$x = \pm\infty$ on the axis of reals.

We assume that the plane is cut in the manner suggested above, so
that $f_u$, $f_v$ are single-valued on the upper half-plane and $\bar f_u$, $\bar f_v$ on the lower
half-plane. The values of $a_u$, $a_v$ for the lower half-plane will be indicated
by $\bar a_u$ and $\bar a_v$, respectively. On that portion of the axis of reals which
we have labeled F (x < x') the pairs of functions $f_u$, $f_v$ and $\bar f_u$, $\bar f_v$ are equal.
Hence the two sets of coefficients $a_u$, $a_v$ and $\bar a_u$, $\bar a_v$ are equal on F for all $\psi$
functions. On the other hand, no such equality holds in general on the
cut portion of the axis of reals. The essential feature of our derivation
is to show that if there exists a suitable good path $\Gamma$ joining a pair of
points $P_1$ in F and $P_4$ in H and crossing no cuts, the relations

$$\bar a_u(x) = a_u(x) \approx 0, \qquad \bar a_v(x) = a_v(x) \tag{D-7}$$

hold approximately for large real values of x in H, provided that $E = E_n$.

¹ Cf. p. 105.





We further assume that the inequality (21·32), or an equivalent inequality,
holds for $P_1$, and that $|a_u(P_1)|$ is bounded in the same way.
Under these circumstances we can neglect $a_u(P_1)$ and $a_u(P_4)$.

Since $\Gamma$ is to cross no cuts, it will enclose no complex zeros of $E - V(x)$
between itself and the axis of reals. Let us choose for $\Gamma$ a path lying
wholly in the upper half-plane and denote its image in the axis of reals
by $\bar\Gamma$.

As is dominant in the Stokes region C, a„ is sensibly constant over 
the portion of F which lies in C. Since a,* is negligibly small at Pi and 
P 2 , av will be sensibly constant on the portions of F in A and B, The 
reflected path F has the same properties as F, and, as dr (Pi) = «« (Pi), 
we infer that (iviPi) = (iv{Pd when small quantities of the order of /ir 
are neglected. But, 

O/vfvijP) ^n(^) 


at X == P 4 , and it follows from D-5 that 


= 1. 


(D-8) 


This leads at once to the quantum condition (21*10) for the eigenvalues 
of (18*2). 

The accuracy of this result is limited only by the quality of the path $\Gamma$
and the finite values of the constant N of (21·32) at $P_1$ and $P_4$. In the
special case of an harmonic oscillator, where $V = \tfrac{1}{2}kx^2$, there are no
complex zeros of $p(x,E)$ and the path can be pushed as far away from

x' and x" as desired. Since \—\ and |”| vary as |x|~® and |xl“^ for large 


values of |x|, it follows that by swinging around the origin in a wide
enough arc we can reduce the error in (D-8) below any assignable value.
Thus the B. W. K. method yields a rigorous independent proof of the
energy-level formula of the harmonic oscillator of Sec. 20.
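
The statement can be checked numerically. The sketch below is only an illustration, not part of the original derivation: it assumes that the quantum condition (21·10) has the half-integer form $\oint p\,dx = (n + \tfrac{1}{2})h$, works in units with $\mu = \omega = \hbar = 1$, and uses the hypothetical helper name `action_integral`. The substitution $x = a\sin\theta$ removes the square-root singularity at the turning points.

```python
import math


def action_integral(energy, mu=1.0, omega=1.0, steps=2000):
    """Closed-path integral of p dx for V = (1/2) mu omega^2 x^2.

    Assumes units chosen freely by the caller; the default mu = omega = 1
    is an illustrative choice, not taken from the text.
    """
    a = math.sqrt(2.0 * energy / (mu * omega ** 2))  # classical turning point
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * d_theta  # midpoint rule in theta
        x = a * math.sin(theta)
        p_squared = 2.0 * mu * (energy - 0.5 * mu * omega ** 2 * x ** 2)
        p = math.sqrt(max(p_squared, 0.0))  # guard against rounding below zero
        total += p * a * math.cos(theta) * d_theta  # dx = a cos(theta) d(theta)
    return 2.0 * total  # the closed path runs out and back


HBAR = 1.0
for n in range(4):
    energy = (n + 0.5) * HBAR  # harmonic-oscillator level with omega = 1
    # The printed ratio should be close to n + 1/2 if the condition holds.
    print(n, action_integral(energy) / (2.0 * math.pi * HBAR))
```

For the harmonic oscillator the ratio printed for each level agrees with $n + \tfrac{1}{2}$ to rounding error, which is the sense in which the approximate quantum condition happens to be exact here.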


III. TRANSMISSION OF MATTER WAVES AT A POTENTIAL BARRIER¹

We assume that the potential function in the neighborhood of the
"hill" is nearly parabolic in form, so that in first approximation in the
neighborhood of the hill we can write

$$p^2 = 2\mu(E - U) = \mu k(x - x_1)(x - x_2) = \mu k\,r_1 r_2\,e^{i(\theta_1 + \theta_2)},$$

where $x_1$, $x_2$ are the classical turning points of Fig. 7 and $r_1$, $\theta_1$, $r_2$, $\theta_2$
are the polar coordinates of $x - x_1$ and $x - x_2$, respectively. As in the
"valley" case previously discussed, we cut the complex x plane along
the axis of reals from $x = x_1$ to $x = +\infty$, restricting $\theta_1$ and $\theta_2$ to the

¹ Cf. p. 110.




range $0 \le \theta \le 2\pi$. Following the sign convention of p. 110 we write

P = ~ (D-9) 

Let w be defined by Eq. (21*6) with the provision that the path of integra- 
tion shall not cross the cut. As before we identify fu and /„ with those 
portions of the functions and respectively, which are 

spread out over the upper half of the cut plane. $\bar f_u$, $\bar f_v$ are the corresponding
functions for the lower half of the cut plane.




Fig. 27. Level lines of the function $|e^{iw}|$ and Stokes regions for the problem of the
parabolic potential hill. (The dominant function is marked in each region of the figure.)


If the turning points $x_1$, $x_2$ of the hill problem are made to coincide
with the corresponding turning points x', x" of the valley problem,
the momentum p at every point of the cut plane in the hill problem is
equal to the corresponding momentum for the valley problem multiplied
by i. Thus the lines of constant $|e^{iw}|$ for the valley problem shown in
Fig. 6, p. 99, become lines of constant argument in the hill problem.
The new lines of constant absolute value are the orthogonals of the old
set. They are shown in Fig. 27 together with one of the new lines of
constant argument MM. The arrow heads on MM indicate the direction
of increasing $|e^{iw}|$.

There are four Stokes regions $\alpha$, $\beta$, $\gamma$, $\delta$ associated with the four
successive quadrants. The corresponding dominant functions are
indicated in the figure.

Let us assume that progressive matter waves are incident on the hill
from the left side only. There will then be reflected as well as incident
waves in the region A to the left of the hill in Fig. 7, but only emergent
transmitted waves in the region C to the right of the hill. As noted on
p. 110, the emergent waves on either side are of the $f_u$ type. We therefore
assume that $a_v$ vanishes in the Stokes region $\alpha$ of the complex plane
shown in Fig. 27, thereby insuring that it shall vanish on that portion

fi2 0 

of the axis of realsdn the interval C where . « ^ is small. If there is a 
good path $\Gamma$ extending from the interval A on the axis of reals to C,
passing around $x_1$ and $x_2$ in the upper half-plane, it will be a line on which
$a_u$ is sensibly constant.

It follows that there is a connection formula of the type 

fu + ^ (D-10) 

'~A ' C 

in which the constant c is yet to be determined. By sotting equal to 
zero in the Stokes region y we obtain the companion formula 

/« = /u — /. + dh (D-11) 

L..^ I I I 

A C 

for waves incident on the hill from the right. Both relations are equally
good for high and low barriers. They have been derived on the assumption
that the potential function is parabolic but are valid for hills of more
general form, provided that the other roots of $E - V(x)$ are far enough
away from $x_1$ and $x_2$ to permit the required good path $\Gamma$ connecting the
regions A and C on the axis of reals. In the region A:

fu = h f rap \p\d^ + 

/.=;. = ip|-H exp {-x[X - y}- 

In the region C : 

eV« = = |p|“^ rap bMfj' 

c-*/. = exp [“xX 

where 

if - iX"”'"'- 

It is useful to introduce at this point the equation of continuity
div I = 

in which I is the mass current density defined for charged particles in 
three dimensions by Eq. (8*5) of Chap. I. In the case of monochromatic 

matter waves is independent of the time and div I must vanish. In 
a one-dimensional problem such as that under consideration, it follows 
that the current is constant for all real values of the basic coordinate x. 
Setting the vector potential of Eq. (8*5) equal to zero, we obtain the 


(D-12) 

(D-13) 

(0-14) 





expression 



for the scalar value of the current in the direction of the positive x axis
due to a single-energy wave function with space factor $\psi$. I is evidently
merely a multiple of the Wronskian of the functions $\psi$ and $\psi^*$.

It follows from (D-12) and (D-15) that in the region A on the axis of
reals $f_u$ and $f_v$ represent currents of unit magnitude moving to left and
right, respectively. Since $f_u = -f_v^*$, the currents are additive, i.e., if
$\alpha$ and $\beta$ are any constants $I[\alpha f_u + \beta f_v] = I[\alpha f_u] + I[\beta f_v]$. In the region
A the function to which (D-10) refers can be regarded as the superposition
of an incident current of magnitude $|c|^2$ and a reflected current
of unit magnitude. In the region C the expressions and 
represent a unit current moving to the right, while and represent 
a unit current moving to the left. Equating the net currents on the 
two sides of the hill for the special case ^ = ^o, we obtain Eq. (21 -40). 
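
The constancy and additivity of the current can be illustrated with ordinary plane waves. The following sketch is an assumption-laden stand-in for the B. W. K. functions: it takes the standard one-dimensional expression $I = (\hbar/2i\mu)(\psi^*\psi' - \psi\psi^{*\prime})$, which may differ from (D-15) by a constant factor, sets $\hbar = \mu = 1$, and uses the hypothetical names `psi`, `dpsi` and `current`.

```python
import cmath

HBAR = 1.0  # illustrative units, not taken from the text
MU = 1.0


def psi(x, alpha, beta, k):
    """Superposition of a right-moving and a left-moving plane wave."""
    return alpha * cmath.exp(1j * k * x) + beta * cmath.exp(-1j * k * x)


def dpsi(x, alpha, beta, k):
    """Analytic derivative of psi with respect to x."""
    return 1j * k * (alpha * cmath.exp(1j * k * x) - beta * cmath.exp(-1j * k * x))


def current(x, alpha, beta, k):
    """Assumed form I = (hbar / 2 i mu) (psi* psi' - psi psi*')."""
    f = psi(x, alpha, beta, k)
    df = dpsi(x, alpha, beta, k)
    value = (HBAR / (2j * MU)) * (f.conjugate() * df - f * df.conjugate())
    return value.real


alpha, beta, k = 0.8 + 0.3j, 0.2 - 0.5j, 1.7
for x in (-3.0, 0.0, 2.5):
    # The same value at every x: the current of a stationary state is constant.
    print(x, current(x, alpha, beta, k))
```

Because the two component waves are complex conjugates of one another apart from constant factors, the cross terms cancel and the net current is the sum of the two separate currents, here $(\hbar k/\mu)(|\alpha|^2 - |\beta|^2)$.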



APPENDIX E 


THE REDUCTION OF CERTAIN BOUNDARY-VALUE PROBLEMS
BASED ON SELF-ADJOINT DIFFERENTIAL EQUATIONS TO
VARIATIONAL FORM¹

Variational Problem A. — The theorem a stated in the text (Sec. 24) 
can be proved as follows. Forming the first variation of the integral 
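J[y] defined by Eq. (24·2), we obtain

$$\delta J = \int_a^b \bigl[\delta y^*(\Lambda y + \lambda\rho y) + y^*(\Lambda\,\delta y + \lambda\rho\,\delta y)\bigr]\,dx. \tag{E-1}$$

Since the operator $\Lambda$ is self-adjoint, we can reduce this expression by
means of Eq. (23·9), identifying the y of that equation with our present
$\delta y$ and the z with our present $y^*$. Using Eq. (24·5) we throw $\delta J$ into the
form

$$\delta J = \int_a^b \bigl[\delta y^*(\Lambda y + \lambda\rho y) + \delta y(\Lambda^* y^* + \lambda\rho y^*)\bigr]\,dx
= \text{real part}\Bigl\{2\int_a^b \delta y^*(\Lambda y + \lambda\rho y)\,dx\Bigr\}. \tag{E-2}$$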

This expression vanishes if y and $\lambda$ form a solution of Eq. (24·1) so that
the eigenfunctions and eigenvalues of Eq. (24·1) with the boundary
conditions (a) are eigenfunctions and eigenvalues of problem A. Conversely,
every solution of problem A is also a solution of Eq. (24·1),
for if a function y(x) and parameter-value $\lambda$ do not form a solution of
Eq. (24·1), we can always choose the arbitrary function $\delta y^*$ so that
$\delta J \neq 0$ and thus prove that y, $\lambda$ do not form a solution of A.

It follows as a corollary that the stationary values of J are all zero.

Variational Problem B. According to theorem b, p. 131, a necessary
and sufficient condition that $\varphi(x)$ forms a solution of problem B is that
$\varphi(x)$ and $\lambda_\varphi = Q[\varphi]/N[\varphi]$ form together a solution of problem A.

It will first be proved that the condition is necessary. We adopt 
the notation introduced in Appendix A indicating that the variation of Q 
is based on the unvaried function y(x) and the particular variation 
8y = daifj{x) by writing 

5Q = dy^Q. 


¹ Cf. Sec. 24, pp. 130-132.




Then, if tp{x) is a solution of problem B, 



for every admissible •n{x). The corresponding variation of J is 
= V{ + xiV} = - VQ + 

If we give the adjustable constant X the value 



(E-3) 

(E-4) 

(E.5) 


it follows that is zero for every ry. Thus \ constitute a 

solution of problem A. 

Conversely, to prove the sufficiency of the condition, let us assume 
that the function ^(x) is a solution of problem A with the eigenvalue Xf. 
Since the stationary values of J are all zero, 


or 


Also 


m = -Qm + = 0 , 



+ Xf = 0. 


(E-6) 

(E-7) 


Eliminating X between (E-6) and (E-7) we obtain 



This completes the proof. 

Variational Problem C. Theorem c of Sec. 24 states that every
solution $\varphi$ of problems A and B, when normalized, yields a solution of
problem C, and that conversely, every solution $\zeta$ of C is a solution of A
and of B with the eigenvalue $\lambda_\zeta = Q[\zeta]$ for the former problem.

Problem C asks for solutions of the variational equation 5Q = 0 when 
the comparison functions y{x) are subject to the normalization condition 

$$N[y] \equiv \int_a^b \rho\,y y^*\,dx = 1. \tag{E-8}$$


This type of problem lies outside the scope of the elementary theory of the 
calculus of variations given in Appendix A. It can be treated by the 
well-known method of undetermined multipliers due to Lagrange, or 
by the following direct attack. 

According to the definition of Appendix A, $\delta y$ is defined as






where $y(x,\alpha)$ is the one-parameter family of comparison functions

$$y(x,\alpha) = \varphi(x) + \alpha\eta(x). \tag{E-9}$$

$\delta y$ is accordingly equal to $\eta(x)\,\delta\alpha$ and is arbitrary except for boundary
and continuity conditions. The condition (E-8) applied to this function
$y(x,\alpha)$ does not lead to a simple restriction on $\eta(x)$ and $\delta y$. We therefore
replace (E-9) by the alternative family of comparison functions


y{x,a) 


<p ix) + otu(x) 


(E-10) 


The normalization requirement is now seen to be satisfied if, and only if,
$u(x)$ is orthogonal to $\rho\varphi$. A permissible first variation in y for problem C
is now defined by





(E-11) 


where $u(x)$ is required to be orthogonal to $\rho\varphi$. With the aid of the
restricted variations so defined we can proceed to the proof of theorem c.

It is clear that if $\varphi(x)$ is a solution of either of the problems A and B,
and if a is an arbitrary constant, $a\varphi(x)$ is another solution with the same
eigenvalue. It follows that every solution of the problems A and B
can be normalized in accordance with (E-8). Let ip{x) be such a nor- 
malized solution. Then = 0 for arbitrary functions rj{x) 

which satisfy the boundary and continuity conditions. If so, the 
restricted variation b^^*[Q/N] calculated from (E-11) must also vanish, 
as b^^y is a special case of the unrestricted variation by. Clearly, 

= real part (pip,b^'‘y) = 0. (E-12) 

Therefore 

This proves that every normalized solution of problems A and B is 
also a solution of problem C. 

We have next to verify the converse proposition that every solution of 
C is also a solution of A and B. Let f (x) be an arbitrary solution of C. 
Then the restricted variation b^^Q is zero for every u(x) orthogonal to 
pf. We have to show that the unrestricted variation b^^J vanishes for 
every iy(x) which satisfies the boundary-continuity conditions. With 
ev^ry unrestricted variation by we associate the restricted variation 

5r“l/ = fiy - f(pf,«y). (E-14) 





This function is actually orthogonal to pf, for 

0)f,5r“y) = (j>M - (j>rM,Sy) = 0. 

Let us now form the unrestricted variation By Eqs. (E-2) 

and (E-13) 

Si'iJ = real part H + Xpf]da:|. 

Reducing the right-hand member, we obtain 

4 - X3j,“iV -I- real part (2(pf,6y){ -0[f] + XiV[f]}^. 

The first two terms drop out because f(a;) is a solution of problem C. 
The expression in braces vanishes if we give X the value 


Olf] 

Mf] 


= Qlf]. 


Thus $\delta J$ vanishes for a proper choice of $\lambda$, and $\zeta(x)$ is accordingly a
solution of A. By theorem b it is also a solution of problem B. This
completes the proof of the theorem.
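
A finite-dimensional analogue may make the equivalence of the three problems easier to picture. The sketch below is not the function-space argument of this appendix: it replaces the functionals by $Q(y) = y^{T}Ay$ and $N(y) = y^{T}y$ for a real symmetric matrix `A` (a hypothetical stand-in), and checks that the ratio $Q/N$ is stationary exactly at an eigenvector, with stationary value equal to the eigenvalue.

```python
def mat_vec(a, y):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a[i][j] * y[j] for j in range(len(y))) for i in range(len(a))]


def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def rayleigh(a, y):
    """Discrete analogue of Q[y] / N[y]."""
    return dot(y, mat_vec(a, y)) / dot(y, y)


def rayleigh_gradient(a, y):
    """Gradient of Q/N with respect to y: 2 (A y - lambda y) / N."""
    lam = rayleigh(a, y)
    ay = mat_vec(a, y)
    norm = dot(y, y)
    return [2.0 * (ay_i - lam * y_i) / norm for ay_i, y_i in zip(ay, y)]


A = [[2.0, 1.0], [1.0, 2.0]]  # illustrative symmetric matrix
print(rayleigh(A, [1.0, 1.0]), rayleigh_gradient(A, [1.0, 1.0]))
print(rayleigh(A, [1.0, 0.0]), rayleigh_gradient(A, [1.0, 0.0]))
```

The gradient vanishes for the eigenvector (1, 1) and not for the trial vector (1, 0), and rescaling an eigenvector leaves the ratio unchanged, which mirrors the freedom to normalize used in problem C.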



APPENDIX F 


THE LEGENDRE POLYNOMIALS AND ASSOCIATED LEGENDRE
FUNCTIONS¹

The definition and differential equation of the Legendre polynomials 
are given in Sec. 26. An equivalent definition is contained in the series 
expansion 

$$\frac{1}{\sqrt{1 - 2xh + h^2}} = \sum_{n=0}^{\infty} h^n P_n(x), \tag{F-1}$$


which is of importance in potential theory. 

The explicit formulas for the first five polynomials are 

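$$P_0(x) = 1, \qquad P_1(x) = x, \qquad P_2(x) = \tfrac{3}{2}x^2 - \tfrac{1}{2},$$

$$P_3(x) = \tfrac{5}{2}x^3 - \tfrac{3}{2}x, \qquad P_4(x) = \tfrac{35}{8}x^4 - \tfrac{15}{4}x^2 + \tfrac{3}{8}.$$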


The general formula is

$$P_n(x) = \frac{1\cdot3\cdot5\cdots(2n-1)}{n!}\Bigl\{x^n - \frac{n(n-1)}{2(2n-1)}x^{n-2} + \frac{n(n-1)(n-2)(n-3)}{2\cdot4\,(2n-1)(2n-3)}x^{n-4} - \cdots\Bigr\}. \tag{F-2}$$

The last term in the braces is

$$(-1)^{n/2}\,\frac{n!}{2\cdot4\cdots n\,(2n-1)(2n-3)\cdots(n+1)}$$

for even values of n and

$$(-1)^{(n-1)/2}\,\frac{n!\,x}{2\cdot4\cdots(n-1)\,(2n-1)(2n-3)\cdots(n+2)}$$

for odd values of n.

The more important properties of these polynomials are contained 
in the following formulas; 

¹ Cf. Sec. 27, p. 145; Sec. 28, p. 150.




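$$(n + 1)P_{n+1} = (2n + 1)xP_n - nP_{n-1}, \tag{F-3}$$

$$\frac{dP_n}{dx} = P_n' = \frac{n(xP_n - P_{n-1})}{x^2 - 1}, \tag{F-4}$$

$$nP_n = xP_n' - P_{n-1}', \tag{F-5}$$

$$\int_{-1}^{+1} P_m(x)P_n(x)\,dx = \frac{2}{2n + 1}\,\delta_{mn}, \tag{F-6}$$

$$\int_{-1}^{+1} P_n(x)\,x^m\,dx = 0, \qquad m = 0, 1, 2, \ldots, n - 1. \tag{F-7}$$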
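
The recurrence (F-3) and the orthogonality relations (F-6) and (F-7) lend themselves to an exact check with rational arithmetic. The sketch below is illustrative only; the function names `legendre` and `integrate_product` are hypothetical, and polynomials are stored as coefficient lists with the lowest power first.

```python
from fractions import Fraction


def legendre(n_max):
    """Coefficient lists of P_0 .. P_n_max built from the recurrence (F-3)."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]            # coefficients of x * P_n
        padded = polys[n - 1] + [Fraction(0)] * 2  # P_{n-1} padded to equal length
        nxt = [((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, padded)]
        polys.append(nxt)
    return polys[: n_max + 1]


def integrate_product(p, q):
    """Exact integral of p(x) q(x) over the interval [-1, 1]."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total


P = legendre(4)
print(P[4])                        # coefficients of P_4, lowest power first
print(integrate_product(P[2], P[3]))
print(integrate_product(P[3], P[3]))
```

Exact fractions are used instead of floating point so that the vanishing of the off-diagonal integrals is an identity rather than a small number.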


The definition and differential equation of the associated Legendre 
functions are given in Sec. 28d. The associated Legendre functions of 
zero order are the ordinary Legendre polynomials just described. The 
first of the functions of higher order are tabulated herewith; 


$$P_{1,1}(x) = (1 - x^2)^{1/2}, \qquad P_{2,2}(x) = 3(1 - x^2),$$

$$P_{2,1}(x) = 3x(1 - x^2)^{1/2}, \qquad P_{3,2}(x) = 15x(1 - x^2),$$

$$P_{3,1}(x) = \tfrac{3}{2}(5x^2 - 1)(1 - x^2)^{1/2}, \qquad P_{3,3}(x) = 15(1 - x^2)^{3/2}.$$


The more important properties of these functions are given by Eqs. 
(F-8) to (F-15) below. 

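$$(l - \tau)(1 - x^2)^{1/2}P_{l,\tau} = xP_{l,\tau+1} - P_{l-1,\tau+1}, \tag{F-8}$$

$$(l + \tau + 1)(1 - x^2)^{1/2}P_{l,\tau} = P_{l+1,\tau+1} - xP_{l,\tau+1}, \tag{F-9}$$

$$(l + \tau)P_{l-1,\tau} = (2l + 1)xP_{l,\tau} - (l - \tau + 1)P_{l+1,\tau}, \tag{F-10}$$

$$(2l + 1)(1 - x^2)^{1/2}P_{l,\tau} = P_{l+1,\tau+1} - P_{l-1,\tau+1}, \tag{F-11}$$

$$(1 - x^2)^{1/2}P_{l,\tau} = 2(\tau - 1)xP_{l,\tau-1} - (l + \tau - 1)(l - \tau + 2)(1 - x^2)^{1/2}P_{l,\tau-2}, \tag{F-12}$$

$$(1 - x^2)^{1/2}P_{l,\tau} = (l + \tau)xP_{l,\tau-1} - (l - \tau + 2)P_{l+1,\tau-1}, \tag{F-13}$$

$$(1 - x^2)P_{l,\tau}' = (l + 1)xP_{l,\tau} - (l + 1 - \tau)P_{l+1,\tau} = (l + \tau)P_{l-1,\tau} - lxP_{l,\tau}$$
$$= -\tau xP_{l,\tau} + \sqrt{1 - x^2}\,P_{l,\tau+1} = \tau xP_{l,\tau} - (l + 1 - \tau)(l + \tau)\sqrt{1 - x^2}\,P_{l,\tau-1}, \tag{F-14}$$

$$\int_{-1}^{+1} P_{l,\tau}(x)P_{l',\tau}(x)\,dx = \frac{2}{2l + 1}\,\frac{(l + \tau)!}{(l - \tau)!}\,\delta_{ll'}. \tag{F-15}$$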

The recurrence formulas are useful in deducing the functions of higher 
order and degree from the lower ones. 
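
A numerical spot check of one of the recurrence formulas can be sketched as follows. The code assumes the definition $P_{l,\tau}(x) = (1 - x^2)^{\tau/2}\,d^{\tau}P_l/dx^{\tau}$ without a phase factor $(-1)^{\tau}$, which agrees with the tabulated functions above but is not quoted from Sec. 28d; the helper names are hypothetical.

```python
from fractions import Fraction


def legendre_coeffs(l):
    """Coefficients of P_l (lowest power first) from the three-term recurrence."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return prev
    for n in range(1, l):
        x_cur = [Fraction(0)] + cur
        padded = prev + [Fraction(0)] * 2
        prev, cur = cur, [((2 * n + 1) * a - n * b) / (n + 1)
                          for a, b in zip(x_cur, padded)]
    return cur


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]


def assoc_legendre(l, tau, x):
    """Assumed definition: (1 - x^2)^(tau/2) times the tau-th derivative of P_l."""
    coeffs = legendre_coeffs(l)
    for _ in range(tau):
        coeffs = derivative(coeffs)
    poly = sum(float(a) * x ** k for k, a in enumerate(coeffs))
    return (1.0 - x * x) ** (tau / 2.0) * poly


x0, l0, tau0 = 0.3, 2, 1
lhs = (l0 + tau0) * assoc_legendre(l0 - 1, tau0, x0)
rhs = ((2 * l0 + 1) * x0 * assoc_legendre(l0, tau0, x0)
       - (l0 - tau0 + 1) * assoc_legendre(l0 + 1, tau0, x0))
print(lhs, rhs)  # the two sides of the three-term relation in the degree l
```

The same helper reproduces the tabulated entries, for instance $P_{3,1}(x) = \tfrac{3}{2}(5x^2 - 1)(1 - x^2)^{1/2}$, so it can be used to test any of the other relations at arbitrary points of the interval.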

For further information regarding the associated Legendre functions
the reader is referred to Bateman's Partial Differential Equations of
Mathematical Physics, Chap. VI, Cambridge, 1932.



APPENDIX G 

THE GENERALIZED LAGUERRE ORTHOGONAL FUNCTIONS¹

The Laguerre polynomials are defined as the coefficients $L_k(x)$ in the
expansion

$$\sum_{k=0}^{\infty} \frac{L_k(x)}{k!}\,t^k = \frac{e^{-xt/(1-t)}}{1 - t}, \qquad |t| < 1 \tag{G-1}$$

or by the equivalent formula²

$$L_k(x) = e^{x}\frac{d^k}{dx^k}\bigl(x^k e^{-x}\bigr). \tag{G-2}$$

They obey the differential equation

$$xy'' + (1 - x)y' + ky = 0 \tag{G-3}$$

and have the following important properties:

$$L_{k+1}(x) - (2k + 1 - x)L_k(x) + k^2L_{k-1}(x) = 0, \qquad k \ge 1 \tag{G-4}$$

$$xL_k'(x) = kL_k(x) - k^2L_{k-1}(x), \qquad k \ge 1 \tag{G-5}$$

$$\int_0^{\infty} e^{-x}L_k(x)L_m(x)\,dx = (k!)^2\,\delta_{km}. \tag{G-6}$$

The derivatives of the ordinary Laguerre polynomials³ are called
generalized Laguerre polynomials and are designated by the notation

$$L_k^{n}(x) = \frac{d^n}{dx^n}L_k(x). \tag{G-7}$$

The integer n is called the order of the polynomial. The generalized
Laguerre polynomials satisfy the differential equation

$$xy'' + (n + 1 - x)y' + (k - n)y = 0, \tag{G-8}$$

and have the orthogonality property

$$\int_0^{\infty} e^{-x}x^{n}L_k^{n}(x)L_m^{n}(x)\,dx = \frac{(k!)^3}{(k - n)!}\,\delta_{km}. \tag{G-9}$$

Equation (G-9) is to be distinguished from the orthogonality relation 
¹ Cf. Sec. 29c, p. 160.

² Cf. Courant-Hilbert, M.M.P., pp. 79, 284; Riemann-Weber, D.P., p. 341.
³ Cf. Courant-Hilbert, M.M.P., p. 284; Riemann-Weber, D.P., p. 341; E.
Schrödinger, Ann. d. Physik 80, 483 (1926).



between $\mathcal{R}_{nl}$ and $\mathcal{R}_{n'l}$. One relation cannot be obtained from the
other by direct transformation.

The first ten of these polynomials are 

$$L_0^0(x) = 1, \qquad L_1^0(x) = -x + 1, \qquad L_1^1(x) = -1,$$

$$L_2^0(x) = x^2 - 4x + 2, \qquad L_2^1(x) = 2x - 4, \qquad L_2^2(x) = 2,$$

$$L_3^0(x) = -x^3 + 9x^2 - 18x + 6, \qquad L_3^1(x) = -3x^2 + 18x - 18,$$

$$L_3^2(x) = -6x + 18, \qquad L_3^3(x) = -6.$$
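
The tabulated polynomials and the orthogonality integrals can be reproduced with integer arithmetic. The sketch below is illustrative: it builds the unnormalized polynomials from the recurrence $L_{k+1} = (2k + 1 - x)L_k - k^2L_{k-1}$, differentiates them to obtain the generalized polynomials, and integrates against $e^{-x}x^{n}$ using $\int_0^\infty e^{-x}x^{j}\,dx = j!$. The value $(k!)^3/(k - n)!$ used for the generalized orthogonality integral is the standard result for this convention and is treated here as an assumption to be checked; the function names are hypothetical.

```python
from math import factorial


def laguerre(k_max):
    """Coefficient lists (lowest power first) of L_0 .. L_k_max."""
    polys = [[1], [1, -1]]
    for k in range(1, k_max):
        nxt = [0] * (k + 2)
        for i, a in enumerate(polys[k]):
            nxt[i] += (2 * k + 1) * a  # (2k + 1) L_k
            nxt[i + 1] -= a            # - x L_k
        for i, b in enumerate(polys[k - 1]):
            nxt[i] -= k * k * b        # - k^2 L_{k-1}
        polys.append(nxt)
    return polys[: k_max + 1]


def derivative(coeffs, order):
    """Coefficients of the derivative of the given order."""
    for _ in range(order):
        coeffs = [j * coeffs[j] for j in range(1, len(coeffs))]
    return coeffs


def weighted_integral(p, q, power):
    """Exact integral of exp(-x) x^power p(x) q(x) over (0, infinity)."""
    return sum(a * b * factorial(i + j + power)
               for i, a in enumerate(p) for j, b in enumerate(q))


L = laguerre(3)
L31 = derivative(L[3], 1)                # generalized polynomial of order 1
print(L[3], L31)
print(weighted_integral(L[2], L[3], 0))  # ordinary orthogonality
print(weighted_integral(L31, L31, 1), factorial(3) ** 3 // factorial(2))
```

Since every quantity is an integer, agreement with the tabulated coefficients and with the normalization constants is exact.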

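The explicit formula for the general case¹ is

$$L_k^{n}(x) = (-1)^k\frac{k!}{(k - n)!}\Bigl\{x^{k-n} - \frac{k(k - n)}{1!}x^{k-n-1} + \frac{k(k - 1)(k - n)(k - n - 1)}{2!}x^{k-n-2} - \cdots\Bigr\}. \tag{G-10}$$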

The following integral formula derived by Schrödinger is also useful:


b 

x’>e-^Lk\x)U'''\x)dx = - w - t) 

’x"(/-7: ,)(-7 ‘)- (o-n. 


Here h is the smaller of the two integers $k - n$ and $k' - n'$. The
parenthesis symbols denote binomial coefficients and are defined by the
relations



In the special case that 

n = n', k — k', p = n + 1, 

all terms in the above sum vanish except those for which t takes on the
values $k - n$ and $k - n - 1$. Then (G-11) yields


'• 00 

X' 

Jo 


= (n + l)!(fc!)2(-l)®*-” 


-G)l:V-0} 


■n - 2' 


k 


:) 


(2* - n + l)(fc!)» 
(fc-n)!. 


(G-12) 


¹ Cf. Condon and Morse, Q.M., p. 68.





Another special case of (G-11) of interest to us is that in which p = n + 1, 
n' = n — 2, k' = k — 1. Then, 


. ^ 00 

Jo 


»(x)rfa: = {n + l)\k\{k - l)!(-l)*-»-i 


X 



- 2 \1 
2j\k — n — l/j 
•S{k\y{k - l)!(2fc-n + 1) 



{k - n)l 


(G-13) 


The function $e^{-x/2}x^{n/2}L_k^{n}(x)$ is called a generalized Laguerre orthogonal
function of order n. It satisfies a Sturm-Liouville equation with eigenvalues
ranging from n to infinity. Hence the Laguerre orthogonal
functions of order n form a complete system. This is not true, however,
of the radial wave functions $R_{nl}(r)$, as can be seen from the fact that the
discrete eigenvalues of Eq. (29·1) have the upper limit zero. The
explanation of this apparent discrepancy lies in the fact that the relation 
between the independent variables r and x involves the eigenvalue 
n = \/^ directly (c/. p. 160). 



APPENDIX H 


TWO THEOREMS RELATING TO THE CONTINUOUS SPECTRUM¹


Theorem I. Let $y[x, \epsilon_n(b)]$ denote a real eigenfunction of the problem
$\beta$ of Sec. 30c; let $l_n(b)$ denote the distance from b to the preceding node of
$y[x, \epsilon_n(b)]$; and let b' denote a suitably chosen point in the interval


Then, 


b — ln(b) < X <h. 


A*(n,6) s - e„(6) = 


yAh',e„{h' )Y 

y[x,tn(b')Ydx 


(H-1) 


Proof: It follows from the definitions of Z„(i>) and y[x,fnib)] that 
y[x,tn+iib)] = y[x,fn(Jb - Z„(6))], 

or that 

tn+iib) - e„(6 — Lib)). (H-2) 

But by the first mean-value theorem 

€»(6) = €„(6 - Lib)) -h Ub)(^^'^^> (H-3) 


where V lies in the interval b — Lib)' < x < b. It follows that 

*»+i(6) - e„ib) = . (H-4) 

(den/db)b' can be evaluated by means of Green's formula 

^yix,E)ytix,Ei) - y(x,EOy*(a;,JS?)J* = siE - Ei)j^yix,E)yix,Ex)dx. 

(H-5) 

We identify yix,E) with y[a;,eB(i>)], and yix,Ei) with a second member of a 
one-parameter family of integral curves which vary continuously with the 
second argument and satisfy the s.p.b.c. at a; = o. Equation (H-5) 
becomes 

yib,Ei)y,[b,tnib)] = x[Ei - fnia)]jy[x,e„ib)]yix,Ei)dx. (H-6) 

Denoting the partial derivative of yix,E\) with respect to the second 
¹ Cf. Sec. 30d, p. 168.




argument by yE(x,Ei), we divide (H-6) by [E\ — «n(6)] and allow E\ to 
approach en(5) as a limit. In this way we obtain 


(H-7) 

Differentiation of the identity 

j/[5,6„(fe)] = 0 

with respect to h yields 

_ yil5,t»(6)l /TT o\ 

db ~ VElbMi)] ^ ’ 


Combining (H-4), (H-7), and (H-8) we obtain 


*«+x(5) - «„(5) = ^ 


y.[b'Mb')]^ 

y[x,t„(J>)]Hx 


(H-9) 


as was to be proved. 

Theorem II. Let $y(x,E)$ denote any family of integral curves of
Eq. (30·10). For every positive value of E it is possible to choose constants
A, $x_0$, $\gamma$, M, such that the B. W. K. approximation


u(XfE) == Ap^^ cos 


+ ■>} 


conforms to the inequality 


y{x,E) - u{x,E) 
A 



X > Xo 


(H-10) 


Proof: Having chosen a definite value of E we pick a value of $x_0$
greater than the largest root of $E - V(x)$. Let $x_1$ be an arbitrary point
to the right of $x_0$ ($x_1 > x_0$). There is a B. W. K. approximation having
the same ordinate and slope as $y(x,E)$ at $x_1$. Let

u{x,xi) = X{xi)p-'^ cos p(XiF)dx + 7(^i)| (H-11) 

denote this approximation. The difference 

i{x,xi) = y(XjE) - ii{x,xi) (H-12) 

will then vanish with its first derivative with respect to x at xi and 
can readily be proved to satisfy the inhomogeneous differential equation 

g -f k[E - F(x)]{ = g(.x,x^,E), 


(H-13) 





where 


g(x,xi,E) = - 


il{x,xi) 




[cf. Eq. (2M2)]. 

Let yi(x,E)y y^ix^E) denote any two real linearly independent solu- 
tions of (3010). Their Wronskian, W = — y 2 yi is independent 

of X, The function 


v(x,x^) 



gyidx — yi 



(H-15) 


is, like a solution of (H-13) which vanishes together with its first 
derivative at 3 : == 0 : 1 . Hence f and are identical. 

We desire to show that approaches a definite limit as x\ 

becomes infinite. To this end we set up an upper bound for \g{XjX),E)\. 

\x^V'\ and \xW"\ have upper bounds Dy and Z> 2 , respectively, in the 
interval .To < a; < . Let p denote a positive real number less than 

unity. We assume that xa is so chosen that E ~ V{x) > pE for all values 
of x greater than xo. Then, if a: > x*o. 


\g(x,xi,E)\ < 


IA(x,)| 


4{2yyiipE)^' 


4 


4 xpE\x^ 


(H-16) 


^i,xC), yi{x,E), yi{x,E) are bounded, so that from (H-16) it follows that 
the integrals in (H-15) converge as Xi — ♦ w and we see that 

lim [y{x,E) — UiXjXi)] = y{x,E) — lim u(x,xi) 

ari— » 0* xi~“* « 


exists. 

But if il(XjXi) approaches a definite limit as a: 1 becomes infinite, A 
and y must approach corresponding limits A and 7 , respectively. We can 
accordingly identify the function u(x,E), which appears in the statement 
of our theorem, with lim u(x,x\). 

By (H-15) and (H-16) we now have 


y{x,E) - u(x,E) 

A ’1 


with 


^ “ 4(^[^* iii} 


(H-17) 

(H-IS) 


Since \yi\ and |yj| are bounded in the region a: > 0, Eq. (H-17) yields 


y(,x,E) - u{x,E) 
A 



(H-19) 


a: > *0 





where 


N 


(H-20) 


This proves the theorem.

It is to be observed that in general all the constants $x_0$, A, $\gamma$, M,
which appear in the statement of the above theorem, vary with E.
Only p can be chosen to be independent of E. It would be very useful
if we could find an upper bound for M independent of E, but unfortunately
such a bound does not exist. If we consider a case in which the
potential energy approaches the value zero at infinity from below, we
can treat xq as a constant. Then N varies as E’~^ for small values of E. 

It follows from the B. W. K. approximation itself that 

varies as Hence, for small values of E, M varies as E ^ . 

Corollary: If we differentiate (H-15) and then allow $x_1$ to become
infinite, it is not difficult to prove that for any fixed energy


\y\x,E) - u'j x.E) 

A 

"{x,E) - u"ix,E) 


Ni\ 


< 


N, 


X > Xo 


(H-2]) 


where Ni and N 2 are positive constants. 



APPENDIX I


CONCERNING THE EXPANSION OF Hf IN SPHERICAL
HARMONICS¹

We prove first the convergence of Let Rim be defined by 

the equation 

— GlmY Im Rim, 

the notation being that of Sec. 30i. Since

Gim = JifYim* sin e dOdip, 
we have/JEjmPim* sin 6 dddip = 0. Hence 
////* sin 0 dddip = sin 0 dddip + sin 0 dddip 

and 

+ fj sin 0 d0£^\R,m\^dv]dr. (I-l) 

As / is quadratically integrable over x,y,z space, the integral "r^lGjml^dr 
must converge. 

The relation Fjm(r) = r^'A/g/m, p. 176, will next be verified, thus 
establishing our right to apply ff term by term to the series (30-37). 
By definition 

fiin(r) = fHHf)Yim* sin 0 d0d.p = sin 0 dddip 

+ sin ® Mip. (1-2) 

Let (£'')op denote the angular-momentum operator 

(L )°p 4^2[sin 0 ® dd) sin*0 dip^ J 

of Eqs. (34-17) and (34-18). Then the operator H can be written in the 
form 

® + 255WV 

¹ Cf. Sec. 30i, p. 176.



Furthermore 




(c/. Sec. 28d), and hence 


// 




sin 6 dedv> = 

S'B-'m 


+ 


a / 

arV ar / 


8xV“ 


+ F(r)(?,^ 

= r->Ajg,„. (1-3) 


The integral sin B dBdtp of (1-2) requires transformation. 

Thus 


// 


Yi„*(HRtn,) sin 0 ded<p = 2^2 J J 

__rr P. f(o2\ V. . 

2jar“ 


Yim*(,{£^)opRim) sin 6 dddip 


2^J*J*^Jm((£^)opFj„*) Hin BdBdifi — d®|^[^sina(F(„* 


dRln 

dd' 


Clearly 


// 


jBjm((£®)opFj„*) sin 0 d0d<p = 


l(l + l)h^ 
4rrY 


// 


RimY im* sin 0 dddip = 0. 


Since and Fim* are periodic in with period 27r, 


Jo Jo dv?Lsin A 




•n ^ * im 

dip dtp 


•)]-»■ 


Finally 

JjF,„*(ffB,„) sin adadv. = [«m a(F,„ 


^ dRim 

ae 


- /2 


lin~ 


ee 


- ) dtp = 0, 
/ Je-oi 


(1-4) 


Combining (1-2), (1-3), and (1-4), we obtain the desired relation 

Fimir) = 

The quadratic integrability of rFi^^ or Aig;,,, can now be established in 
the same manner as that of rGim- 



APPENDIX J


THE JACOBI POLYNOMIALS¹


The Jacobi polynomials G(t) are terminating hypergeometric series
and are related to the general hypergeometric functions

$$F(\alpha, \beta, \gamma, t) = 1 + \frac{\alpha\beta}{1\cdot\gamma}\,t + \frac{\alpha(\alpha + 1)\beta(\beta + 1)}{1\cdot2\cdot\gamma(\gamma + 1)}\,t^2 + \cdots \tag{J-1}$$

by the formula

$$G_p(a, b, t) = F(-p,\ a + p,\ b,\ t), \tag{J-2}$$

where a, b, and p are integers. They are polynomials of degree p,
as may be seen from the alternative definition

^■'(l-O'G.d+d+s. 1+d, 0 = -<)*+'’]• (J-3) 

The polynomials Gp(l+d+Sj 1+d, t) are solutions of the differential 
equation 

$$t(1 - t)G'' + [1 + d - (2 + d + s)t]\,G' + p(p + d + s + 1)\,G = 0. \tag{J-4}$$

The normalization and orthogonality relation is 

_ pKd iy(s+p )i /T .V 

” """'(c/+py!(f/+.sH-/0!(rf+s+2p+l)’ 

The following relations are useful: 

<r/p(i+d+s, i+f/, 0 = <i, <) 

d-i, 0], (J-6) 

^f?p(l+d+s, 1 +rf, t) — 2-{~(l, 1), (J-7) 
(l+d+sH--2p)(7p(l+d+s, 1+d, t) = p(jp^}{2-{-d+Sj 1 +d, /) 

“1“ (l+d+.9+7>)(7p(2+d+,s*, 1 (J-8) 

(1 -t-d‘-f’Sd~2p)(l H-d)Grp(l +d~[~6', l"|“d, 0 

=s — p(,s+p)(?p-i(2+d+s, 2+d, t) 

+ (14"d+s+p)(l+d+p)Gp(2+d+5, 2+d, t), (J-9) 

Opil+d+8, 1+d, t) - ^|Jp)!(-l)’'Gp(H-d+s, 1+s, 1-0. (J-10) 

¹ Cf. Sec. 34f, p. 234.




The polynomials of lowest order are: 


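$$G_0(1 + d + s,\ 1 + d,\ t) = 1,$$

$$G_1(1 + d + s,\ 1 + d,\ t) = 1 - \frac{d + s + 2}{d + 1}\,t,$$

$$G_2(1 + d + s,\ 1 + d,\ t) = 1 - 2\,\frac{d + s + 3}{d + 1}\,t + \frac{(d + s + 3)(d + s + 4)}{(d + 1)(d + 2)}\,t^2.$$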

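
The polynomials of lowest order follow directly from the terminating hypergeometric series. The sketch below is illustrative and assumes the series form $F(\alpha, \beta, \gamma, t) = \sum_k \frac{(\alpha)_k(\beta)_k}{k!\,(\gamma)_k}t^k$ with $G_p(a, b, t) = F(-p, a + p, b, t)$; the function name `jacobi_coeffs` and the sample values d = 1, s = 2 are hypothetical choices.

```python
from fractions import Fraction


def jacobi_coeffs(p, a, b):
    """Coefficients (lowest power first) of G_p(a, b, t) = F(-p, a + p, b, t)."""
    coeffs = [Fraction(1)]
    term = Fraction(1)
    for k in range(p):
        # Ratio of successive terms: (alpha + k)(beta + k) / ((k + 1)(gamma + k)),
        # with alpha = -p, beta = a + p, gamma = b; the series stops at k = p.
        term *= Fraction((-p + k) * (a + p + k), (k + 1) * (b + k))
        coeffs.append(term)
    return coeffs


d, s = 1, 2                              # illustrative integer parameters
print(jacobi_coeffs(1, 1 + d + s, 1 + d))
print(jacobi_coeffs(2, 1 + d + s, 1 + d))
```

For d = 1 and s = 2 the coefficients can be compared term by term with the closed forms for the first- and second-degree polynomials, for example the linear coefficient $-(d + s + 2)/(d + 1) = -5/2$.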
References 

Frank and von Mises, Differentialgleichungen der Physik, I, 2d ed., p. 423, Berlin,
1930.

F. Reiche and H. Rademacher, Zeits. f. Physik 39, 444 (1926), 41, 453 (1927).

R. de L. Kronig and I. I. Rabi, Phys. Rev. 29, 262 (1927).



APPENDIX K 


SCHLAPP'S METHOD¹


The secular equation, det(<"”‘^Hi — = 0, can be written in the 

form 


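$$\begin{vmatrix}
W & D_1 & 0 & \cdots & & \\
D_1 & W & D_2 & \cdots & & \\
0 & D_2 & W & \cdots & & \\
\vdots & & & \ddots & W & D_{n-m-1} \\
 & & & \cdots & D_{n-m-1} & W
\end{vmatrix} = 0,$$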

where W — and 


(K-1) 


$$D_k^2 = \frac{k(2m + k)(n - m - k)(n + m + k)}{[2(m + k) - 1][2(m + k) + 1]}. \tag{K-2}$$

In a determinant of the above type it is permissible to multiply any 
off-diagonal element by any number a not equal to zero, provided that 
at the same time we multiply the conjugate element by 1/a. Hence 
(K-1) is equivalent to 

W 

1 W 
0 1 


Transferring the first and last factors of the numerator of $D_k^2$, and the
first factor of the denominator, to the conjugate element, we obtain a

determinant of the form $K_{n,m}$,



w 

(2m+l)(w — w~ 

D 0 

0 

2?w+l 


1 • (n+m+l) w 

(2m-|-2)(n— OT- 

ll) 0 



2m +3 


0 

2{n+m+2) 

W 

(2m +3) (w — m — 3) 

2m-\-5 

2»re-|-5 

0 

0 

SCw+m+S) 

W 


2m+7 




= 0. (K-4) 


W 


D„_ 


W 


= 0 . 


(K-3) 


¹ Cf. Sec. 60, p. 407.




The sum of the two nonvanishing off-diagonal elements of the rth row
is now n - m - 1. If we add to the elements of each column the sum
of corresponding elements of every alternate succeeding column we obtain
a determinant in which all elements lying below the principal diagonal
have the values W, or n - m - 1, alternately. The elements on and
above the principal diagonal are unaffected. Starting from the bottom
we now subtract from each row the row next but one above it, thereby
obtaining a new continuant in which the element (3,2) is zero. Thus

Kn,n.{W) = 

0 
0 

(2m+3)(n — m—3) 
2m+5 

W 

= 0. 


w 

n — m — 1 
0 
0 


n — m—l 
W 

0 

0 


0 

(2m+2 )(n--m — 2) 
2m +3 

W 

n+m+l 
2m +3 


The new secular determinant factors into the product of the second-order 
minor 

W n— m—1 
n—m— 1 W 


and a determinant which is readily identified with ' Thus 

+ (n — m — 1) are roots of Kn,m{W). Similarly ±(n — - m 3) 
are roots of etc. Proceeding in this way we finally obtain 


as the last factor of An.m either W or 


\W 1 
1 W 


according as n - m is even or odd. Hence the roots of $K_{n,m}(W)$
include every other integer from -(n - m - 1) to +(n - m - 1).
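
The conclusion can be verified directly for small quantum numbers. The sketch below is an independent check rather than Schlapp's reduction: it evaluates the continuant with diagonal elements W and off-diagonal elements $D_k$ by the three-term recurrence $K_j = W K_{j-1} - D_{j-1}^2 K_{j-2}$, taking $D_k^2$ from (K-2) and assuming the determinant has n - m rows. The function names are hypothetical.

```python
from fractions import Fraction


def d_squared(k, n, m):
    """Square of the off-diagonal element D_k as given by (K-2)."""
    return Fraction(k * (2 * m + k) * (n - m - k) * (n + m + k),
                    (2 * (m + k) - 1) * (2 * (m + k) + 1))


def secular_polynomial(w, n, m):
    """Value of the (n - m)-rowed continuant at W = w, in exact arithmetic."""
    size = n - m
    prev, cur = Fraction(1), Fraction(w)  # continuants of order 0 and 1
    for k in range(1, size):
        prev, cur = cur, w * cur - d_squared(k, n, m) * prev
    return cur


def integer_roots(n, m):
    """Integers in a window slightly wider than the expected range that are roots."""
    bound = n - m - 1
    return [w for w in range(-bound - 1, bound + 2)
            if secular_polynomial(w, n, m) == 0]


for n, m in [(3, 0), (4, 0), (5, 1)]:
    print(n, m, integer_roots(n, m))
```

Since the continuant has degree n - m in W and the search finds n - m distinct integer roots in each case, these are all of its roots.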




NAME INDEX 


A 

Abraham, 600
Auger, 213 

B 

Barnett, 501 

Bartlett, 210 

Bateman, 584 

Bichowsky and Urey, 491 

Bieberbach, 140, 142

Birkhoff, 25, 91, 105

Blaton (see Rubinowicz and Blaton)

Bôcher, 348

Bochner, 36 

Bohr, 2, 72, 75, 90, 178, 179, 328, 374- 
378, 474, 479 
Bolza, 133 

Born, 2, 179, 379, 383, 427, 452, 482, 505 
Born and Fock, 431 

Born, Heisenberg, and Jordan, 370, 5J3, 
543 

Born and Jordan, 3, 348, 366, 371-373, 
427 

Born and Oppenheimer, 419 
Breit, 232, 450, 525 
Bridgman, 58, 76 
Brillouin, L., 43, 91, 398, 477 
Brillouin, M., 90 
Brinkman, 465 

de Broglie, 2, 3, 9, 10, 13, 19, 20, 90
Byerly, 149 

C 

Carleman, 262 
Casimir, 179
Compton, 491 

Condon (see Gurney and Condon) 
Condon and Morse, 150, 222, 234, 451 
Condon and Shortley, 338, 420, 462, 530,
536, 539, 540, 545, 556

Coolidge (see James and Coolidge) 
Courant and Hilbert, 87, 90, 118, 122, 
124, 130, 132, 136, 137-139, 145- 
147, 202, 216, 237, 410, 415, 563, 585 


D 

Dancoff and Inglis, 502
Darwin, 492, 510 
Davisson and Germer, 14 
Dennison, 535 

Dirac, 79, 201, 241, 248, 265, 267, 281
283, 287, 299, 315, 339, 427, 435, 
441, 450, 452, 492, 509, 513, 545 
Doi, 408 
Drude, 8 

Dunham, 107, 157, 388 
E 

Eckart, 308 

(See also Hughes and Eckart) 
Ehrenfest, 49 

Einstein, 2, 4, 5, 76, 378, 448, 449 
Einstein, Podolsky, and Rosen, 244, 328 
Epstein, 403, 404, 408 

F 

Feenberg, 71, 98, 112 
Fermi, 450, 477 
Fischer, 259 
Fock, 29, 312, 477 

(See also Born and Fock) 

Fowler, 439, 446, 448 
Fowler and Nordheim, 109 
Frank and von Mises, 595 
Frenkel, 27, 34, 226, 376, 502, 525 
Fues, 163, 177, 178, 195, 388 

(See also Nordheim and Fues) 

Furry, 244, 328 

G 

Gamow, 179, 192, 194 
Gaunt, 525 

Germer (see Davisson and Germer) 

Gibbs, 55, 434 
Gordon, 21 





Goudsmit, 2 

(See also Pauling and Goudsmit; 
Uhlenbeck and Goudsmit) 
Gronwall, 210 
Gurney and Condon, 179 
Güttinger and Pauli, 545, 546 

H 

Halpern and Sexl, 403 
Hamilton, 9, 11 
Hartree, 420, 477, 551 
Havelock, 11 

Heisenberg, 2, 3, 72-77, 223, 337, 366, 
377, 491, 525, 548, 551 
(See also Born, Heisenberg, and 
Jordan) 

Heisenberg and Jordan, 492, 505 
Heitler and London, 420, 552 
Herzfeld (see Murnaghan and Herzfeld) 
Hilbert (see Courant and Hilbert) 
Hilbert, von Neumann, and Nordheim, 
245 

Hill (see Kemble and Hill) 

Hiyama (see Ishida and Hiyama) 

Hu, 135 

Hughes and Eckart, 78 
Hund, 82, 317 
Hylleraas, 197, 267, 425 
Hylleraas and Undheim, 415 

I 

Ince, 79, 140, 168 
Inglis, 502 

(See also Dancoff and Inglis) 

Ishida and Hiyama, 408 

J 

James and Coolidge, 197, 425 
Jeffreys, 91, 94 
Jordahl, 394 
Jordan, 241, 265, 432 

(See also Born and Jordan; Born, 
Heisenberg, and Jordan; Heisenberg 
and Jordan) 

K 

Kellogg, 566 

Kemble, 96, 126, 157, 419 


Kemble and Hill, 18, 22, 236, 290 
Kennard, 491 
Klein, 21 

Kramers, 39, 91, 94, 107, 403, 502 

Kratzer, 157 

Kronig, 419 

Kronig and Rabi, 595 

L 

Lanczos, 405 

Landau and Peierls, 219 

Landé, 491, 499, 503 

Langer, R. E., 91, 95, 103, 107 

Langmuir, 536 

von Laue, 188 

Lengyel, 394 

Lewis, 536 

London (see Heitler and London) 

M 

Maupertuis, 7 

von Mises (see Frank and von Mises) 
Morse, M., 135 
Morse, P. M., 106 

(See also Condon and Morse) 
Mulliken, 157, 423 
Murnaghan and Herzfeld, 13 

N 

von Neumann, 55, 71, 81, 114, 117, 
201-203, 220, 241, 247, 251, 255, 
259-261, 263, 267, 278, 284, 321, 
435 

(See also Hilbert, von Neumann, and 
Nordheim) 

Nicholson (see Schuster and Nicholson) 
Niessen, 386 

Nordheim (see Fowler and Nordheim; 
Hilbert, von Neumann, and Nordheim) 

Nordheim and Fues, 26 
O 

Oppenheimer, 163, 181 

(See also Born and Oppenheimer) 

P 

Pauli, 2, 21, 29, 179, 223, 224, 432, 441, 
491, 492, 510, 515, 517, 518, 525 
(See also Güttinger and Pauli) 





Pauling and Goudsmit, 160, 478, 482, 
491, 507 

Peierls (see Landau and Peierls) 
Plancherel, 36 
Planck, 1, 2, 461 
Podolsky, 237 

(See also Einstein, Podolsky, and 
Rosen) 

R 

Rabi (see Kronig and Rabi) 

Rademacher (see Reiche and Rademacher) 

Ramberg and Richtmyer, 213 

Rayleigh, 1, 36 

Reiche and Rademacher, 595 

Rice, 109, 195 

Richtmyer (see Ramberg and Richtmyer) 
Riemann and Weber, 84, 90, 130, 140, 
146, 147, 585 
Riesz, 259 
Ritz, 410 
Robertson, 74 
Rojansky, 404 

Rosen (see Einstein, Podolsky, and 
Rosen) 

Ruark, 49 

Ruark and Urey, 157, 158 
Rubinowicz, 462 
Rubinowicz and Blaton, 462 
Rupp, 14 

Russell and Saunders, 528 
S 

Saunders (see Russell and Saunders) 
Schlapp, 404, 596 
Schlesinger, 142, 177 
Schrödinger, 2-4, 9, 10, 16, 25, 79, 87, 91, 
158, 177, 222, 237, 328, 345, 380, 
404, 427, 451, 585 
Schuster and Nicholson, 11, 39 
Schwarzschild, 403 
Sexl (see Halpern and Sexl) 

Shortley, 198 

(See also Condon and Shortley) 


Slater, 43, 55, 427, 452, 477, 479, 495, 
535, 555 

Sommerfeld, 2, 49, 87, 158, 222, 233, 451 

Stokes, 95 

Stone, 114 

Sugiura, 423 

Swann, 5 

T 

Thomas, 477, 502 
Thomson, 14 
Titchmarsh, 36 

U 

Uhlenbeck, 2 

Uhlenbeck and Goudsmit, 491, 500 
Undheim (see Hylleraas and Undheim) 
Urey (see Bichowsky and Urey; Ruark 
and Urey) 

V 

Van Urk, 482 

Van Vleck, 27, 108, 375, 376, 378, 383, 
394, 403, 408, 419, 449, 466, 479, 505 
Vinti (see Witmer and Vinti) 

W 

Waller, 408, 505 
Wang, 425 

Weber (see Riemann and Weber) 

Weisskopf and Wigner, 181 

Wentzel, 43, 91, 195, 408 

Weyl, 115, 125, 138, 163, 165, 223, 255 

Whittaker, 7, 561 

Wigner, 309, 317, 515, 517, 518 

(See also Weisskopf and Wigner) 
Wilson, 382 
Witmer and Vinti, 336 

Y 

Young, 9 




SUBJECT INDEX 


A 

Action function, 25, 44 
Action integral, 8, 25, 560 
Action variable, 375 
Addition of states, 201 
Adiabatic theorem, 431, 432 
Adjoint manifold of operator, 276 
Adjoint matrix, 350 

Adjoint operator, first definition of, 122n 
second definition of, 203, 512 
Alkali spectra, energy levels of, 474-481 
theory of fine structure of, 503-507, 
519-522 

Alpha particles, emission of, 179, 187-192 
Analytic function, 140 
Analyticity of wave functions, 18, 198, 200 
Angular momenta, combination of, 495- 
498, 537-540 

Angular momentum, 151, 224-234, 314- 
317 

of electron spin, 500, 503, 504, 510-519, 
529 

internal, 228 

resultant of spin and orbital, 498, 504, 
520 

of system of particles, 224, 227-234, 
292, 293 

Angular-momentum operators, 224-234, 
292, 293, 314-317 

Angular-momentum quantum number, 
151, 234, 493 

Antisymmetric wave functions, 337, 531, 
533-536 

Antisymmetrizer, 535, 536 
Antisymmetrizing operator, 338, 529, 
534, 535 

Assemblages, canonical, 434, 446 
chaotic, 438 
concrete, 65, 319 
Gibbsian, 53-55, 329, 433-448 
mixed case, 54, 320 
pure case, 320 
Asymptotic agreement of classical and 
quantum theories, 51, 229, 230, 302 

Atomic energy levels, classification of, 
493, 494, 538, 539 

Atomic model of Bohr, idealized, 474 
Atomic units, 420 

Atoms, complex, in problem A approximation, 526 
in problem B approximation, 528, 
555 
in problem C approximation, 528 
with many electrons, 474-556 
perturbation theory of, 484-488, 
526-528 

Auger effect, 195, 213, 214 
Azimuthal quantum number l, 151 

B 

Bessel’s inequality, 136 
Binomial coefficients, 586 
Bohr theory, 91, 108, 157, 178 
postulates of, 374 

Boundary conditions, linear homogeneous, 125, 130 
physical, 126, 153 
singular-point, 125-128, 130, 131, 163 
Boundary and continuity conditions, 
78, 197-201 

Brillouin-Wentzel-Kramers method, 46, 
90-112, 152, 155, 157, 168, 168, 182, 
572-578, 587 
higher approximations of, 107 
modification of, for radial motion, 
107, 108, 155 

Broadened energy levels, 181, 183, 195 

C 

Calculus of variations, 130, 557-563 
Canonical assemblage, 434, 446 
Canonical equations (see Hamilton’s 
canonical equations of motion) 
Canonical Heisenberg matrices, 367 
Canonical Schrödinger matrices, 369 
Canonical transformation, 247, 356, 357, 
358 



Central-field approximation, 485, 526, 
541-543, 548 

Characteristic functions (see Eigenfunc- 
tions) 

Characteristic values (see Eigenvalues) 
Classical local momentum, 9, 20, 69, 81 
imaginary, 81 
Classical orbits, 331-334 
Closed shells, 536, 537 
Commutation and simultaneous measure- 
ments, 334 

Commutation rules, for angular-momen- 
tum components, 292, 293 
for conjugate dynamical variables, 282 
Commuting operators, 281-286 
Commuting set, of dynamical variables, 
normal, 286, 287 
of observables, normal, 339 
Complete set, of mutually compatible 
operators, 254 

of normally commuting dynamical vari- 
ables, 287 

of normally commuting observables, 
339, 531, 532 

Completeness of system of functions, 119, 
120, 136, 137, 144, 149, 150, 216, 
253, 254 

with continuous spectrum, 165, 171, 
172 

modified form of, 137 
Complex eigenvalues, of non-Hermitian 
operators, 242n 

of weakly quantized states, 192-194 
Configuration, electronic, 526 
Configuration space, 21, 115 
with electron-spin coordinates, 510- 
512, 523, 524 

Conjugate dynamical variables, 26, 74, 
282, 293-299 
measurement of, 334 
spectra of, 299 

Connection formulas, 94, 100-103, 111, 
183 

Conservation, of an arbitrary dynamical 
variable, 290 
of electricity, 222 
of energy, 288 
Continuant, 407 

Continuous spectrum, 85, 162, 588-591 
elimination of, 163 
limit of, 211, 212 

of many-particle problem, 215-217 


Continuous spectrum, normalization of, 
164, 169, 170, 176-178 
and perturbation theory, 394 
Continuous-spectrum eigenfunctions not 
quadratically integrable, 226 
Convergence, mean-square, 138, 217 
of perturbation theory, 382 
Coordinate system, in quantum me- 
chanics, 236n 
of type 1, 254 

Corpuscular nature of matter, 87n 
Corpuscular theory of light, 2, 3 
Correspondence principle of Bohr theory, 
3, 367, 375, 451 

Coupling of individual electrons, 537-540 
Current, electric, 222 
mass, 31, 32, 110, 222, 577 
of probability, 189 

D 

Degeneracy, 147, 150, 217, 218, 311-317 
accidental, 312 
continuous-spectrum, 312 
Coulomb, 312 

physical vs. mathematical, 531 
removal of, for set of simultaneous 
eigenvalues, 287 
symmetry, 312 

Density function, or density factor, 123, 
124, 128, 164, 239, 268 

Determinant, functional or Jacobian, 64, 
238, 248, 269, 295 
of matrix, 350 

Determinism and indeterminism, 6, 7, 76 
Diagonal sum method, 555 
Diatomic molecule, dumbbell model of, 
155, 386, 419 

fixed nuclei problem of, 419-426 
perturbation theory of, 386 
Differentiation of functions of operators, 
300 

Diffraction of electrons, 14, 70 
Dirac δ function, 267 
Dirac notation for probability ampli- 
tudes, 268 

Dualistic nature, of light, 3-7 
of matter, 3, 52 

Dynamical variables, conjugate, 26, 
293-300 

mathematical definition of, 264, 276 
non-Hermitian, 275 





Dynamical variables, operational defini- 
tion of, 318 

simultaneously measurable, 257 
symmetric, 337 
unitary, 275 

E 

Eigendifferentials, 164, 169, 170, 216, 252 
Eigenfunctions, 80 
of general Hermitian operator, 252 
simultaneous, 252, 284, 285 
Eigenvalue-eigenfunction problem, 80 
matrix form of, 359 
v. Neumann form of, 263 
as a principal-axis transformation, 362, 
372, 373 

variational forms of, 130-132, 206, 207, 
578-582 
Eigenvalues, 80 

complex, of non-Hermitian operator, 
242n, 275 

complex, of weakly quantized states, 
192-194 

of Hermitian operator (definition), 251 
reality of, 123, 128, 206, 275 
of unitary operator, 275, 276 
of variational problem, 131 
Eigenvectors, 359, 391 
Electron spin, classical theory of, 500- 
503, 524 

Pauli theory of, 510-522, 523 
Electron-spin hypothesis, 491, 492 
Electron-spin matrices, 512-519 
Electron-spin operators, 512-519 
Electronic configuration, 526 
terms originating in an, 537-540 
Energy, measurement of, 328-331, 344, 
347 

Energy levels, of alkali atoms, 474-481 
broadened, 181, 183, 195 
classification of atomic, 493, 494, 538, 
539 

splitting of, by perturbations, 388 
Energy operator “"giSdf* ^ 

Energy operators, 234-240 

(See also Hamiltonian operator) 
Energy variation when Hamiltonian 
depends on t, 289 
Equation of continuity, 31 
Equivalent electrons, 536, 539, 540 


Euler’s angles, 231 
Euler's equation, 557-559 
Even and odd permutations, 337 
Even and odd states, 314, 531, 541 
Everywhere-dense manifold, 201 
Exchange energy, 553, 554 
Existence, of eigenvalues of one-dimen- 
sional oscillator, 84 

of solutions, of many-particle problem, 
196, 208, 214 

of v. Neumann eigenvalue problem, 
278 

of the Sturm-Liouville problem, 128, 
129 

of system of simultaneous eigenfunc- 
tions, 284, 285 

Expansion, in series of functions, 113, 
120, 135, 226, 234 

series-integral, 164, 176, 215-217, 265 
Expansions with mean-square conver- 
gence, 138, 217, 242 

Expectation value (see Mean value, 
statistical) 

Extremals, 131, 559 

F 

Fermat's principle, 7-10, 12, 20, 22, 47 
Fine structure, of alkali spectra, theory 
of, 503-507 

of optical atomic spectra, 491-495 
Fischer-Riesz theorem, 259, 262, 274 
Fourier analysis, 12, 62, 67 
Fourier integral theorem, 36, 65, 162, 173 
Fourier transform, 36, 221 
Function space, 119 

(See also Hilbert space) 

Functions, of bounded variation, 566 
of class A, 79, 131, 153 
of class B, 80, 86, 162 

not quadratically integrable, 226 
of class D, 197-201 

with electron-spin coordinates, 524 
of non-commuting linear operators, 300 
of a normal set of commuting dynami- 
cal variables, 286 

physically admissible, 17, 79, 131, 197- 
201, 524 

of a single operator, 279 
Fundamental equation (see Indicial equa- 
tion) 





G 

Geiger-Nuttall law, 188 
Gibbsian assemblage of independent 
systems, 53-55, 329, 433-448 
Gibbsian canonical assemblage, 434, 446 
Group, permutation, 309, 529 
rotation-reflection, 308, 529, 532 
of the Schrödinger equation, 310 
Group velocity, 10-13, 20, 39, 42, 49 
Gyromagnetic ratio, 501 

H 

Hamilton-Jacobi equation, 9, 24, 25, 43, 
44 

Hamiltonian function, classical, 23 
Hamiltonian operator, 24, 28 
for complex atoms with spin, 524-526 
transformation of, 237-240 
Hamilton’s canonical equations of 
motion, classical form of, 22, 24, 293 
matrix form of, 367 
operator form of, 301, 302 
Hamilton’s principle, 559, 560 
Hartree self-consistent field, 477, 551 
Heisenberg inequality, 73, 222 
Heisenberg matrices, 367 
Heisenberg uncertainty principle, 72-77, 
192, 222 

Helium atom, 209-212, 547-553 
Hermitian character of Hamiltonian, 
202-206, 462 

Hermitian domain, 203 
Hermitian manifold, 210, 251, 263 
of type D, 251 
Hermitian matrix, 350 
Hermitian operator, 203, 251, 512, 524 
Hermitian orthogonal functions, 90 
Hermitian polynomials, 90 
Hilbert space, 114, 119n, 120 
Huygens’ principle, 45, 467 
Hydrogen molecule, 419-426, 552 
Hydrogenic atom, 157-161 
in electric field, 403-408 
in magnetic field, 398-403 
relativistic theory of, 507-509 
Hydrogenic states of atoms, 478 

I 

Identical particles, 335-339 
Identity of macroscopic bodies, 340 


Indicial equation, 141, 143, 153 
Integrals of the motion, 291, 311-317, 
396-398, 528-533 

Interchange operators or transpositions, 
308 

Intersystem combination lines, 527 
Invariance, gauge, 29 
with respect to substitution, 303 

J 

Jacobi polynomials, 234, 594, 595 
Jacobian determinant, 64, 238, 248, 269, 
295 

K 

Kronecker symbol, 90 

L 

L complex, 316, 539 
Lagrangian function, 23n, 293, 560 
Laguerre orthogonal functions, 585-587 
Laguerre polynomials, 160, 585 
Landé magnetic core theory, 498, 499 
Laporte rule, 540, 541 
Larmor precession, 400, 401 
Least action, 7-10, 12, 20, 560-562 
Least time (see Fermat’s principle) 
Legendre functions, associated, 149, 150, 
584 

Legendre polynomials, 143-145, 583 
Linear independence, 117n 
Linear manifold, 201 

M 

Macroscopic bodies, identity of, 340 
trajectories of, 331-334 
Magnitude of a function, 116 
Matrices, 348-474 
with continuous elements, 363-366 
Heisenberg, 367 
Heisenberg canonical, 367 
Schrödinger, 367 
Schrödinger canonical, 369 
similar, 348 
Matrix, adjoint, 350 
determinant of, 350 
diagonal, 284, 349 
Hermitian, or self-adjoint, 350 
of a linear transformation, 101, 355 





Matrix, of an operator, 352 
ordered diagonal, 349 
reciprocal, 349 
step, 351, 390 
unit, 349 

Matrix addition, 348 
Matrix form of eigenvalue-eigenfunction 
problem, 359 

Matrix multiplication, 349 
Matrix transformation, canonical, 355, 
357, 358 

Maximum-minimum principle, 216n 
Mean value, statistical, 72 

of angular-momentum components, 
229, 230 

of arbitrary dynamical variable, 242, 
243, 258 

of a coordinate, 50, 219, 220, 270 
of energy, 234-236 
of linear-momentum components, 
221 

for a mixture, 320 
Measurements, 75, 318-347 
complete predictive, 346 
as correlations, 342, 343 
of energy, 328-331, 344, 347 
and identical particles, 335-340 
individual, 318 
of position, 325 

postulates concerning, 323, 324, 326 
predictive, 323 
of radial momentum, 335 
retrospective, 323 
simultaneous, 76, 257, 334 
statistical, 318 
types of, 318, 341, 342 
Minimal sequence, 411 
Mixtures (see Assemblages, mixed case) 
Molecular energy levels, 157, 386, 419- 
426 

Momenta, measurement of individual, 
68, 69 

Momentum, of center of gravity, 63-66 
classical local, 9, 20, 69 
of free particles, operational definition 
of, 58-60 

imaginary classical local, 81 
measured linear, 57, 69 
measurement of, in force field, 66-68 
perturbation of, by position measurement, 
59, 66 
radial, 297, 335 


Momentum, and wave length, 5, 13 
Young's local, 9 
Momentum operator, 220-224 
Monochromatic waves, 15-18, 78 
Multiply-periodic, 178, 179, 374 

N 

Nodes, 84, 86, 128, 129, 154 
Norm of a function, 116 
Normal packet function, 43, 174, 175 
Normal properties of an assemblage, 433 
Normalization, 30, 31, 33, 52, 90, 147, 
238, 239, 246 

with spin coordinates, 511, 523 

O 

Observables, 248n, 339 
complete set of type 1, 339 
Observation (see Measurements) 
Observing mechanism, 75, 341, 343-347 
Operational definition, of general dynam- 
ical variable, 318 

of momentum of free particles, 58-60 
Operator, adjoint, 122, 203, 512 
defined by a matrix, 353 
defining physical quantity, 242-245 
definition of, 24n, 203 
Hermitian, 203, 512, 524 
matrix of, 352 
momentum, 220-224 
multiplication, 243, 258, 268-270 
of a positional coordinate, 259 
projection, 261 
self-adjoint, 121-123, 203 
unitary, 247, 275, 512, 524 
unitary integral, 271n 
velocity, 232 

Operators, mutually compatible, 250, 
254, 286 
symbols for, 244 
of type 1, 254 

as dynamical variables, 248-256 
of type 2, 259 

Orbits of electrons in atoms, 340 
Orthogonality, defined, 118 
of eigendifferentials, 170, 252, 263 
of eigenfunctions, 120, 121, 206, 216 
with respect to density function, 123, 
124, 128 

Orthonormal system of functions, 118 





Oscillator, anharmonic, 81-87, 574, 575 
harmonic, 87-90, 575 
matrix theory of, 370 
selection rules for, 409 

P 

Pauli exclusion principle, 71, 337, 482, 
491, 530, 533-535 
Permutation group, 306, 529 
Permutation operators, 308-310, 336, 529 
Permutations, even and odd, 337 
Perturbation theory, of atoms, 484-488, 
526-528 

and continuous spectrum, 394 
convergence of, 382 
for degenerate problems, 388-398 
and integrals of motion, 390-398 
involving the time, 427-432 
matrix form of, 391 
for nondegenerate problems, 380-388 
Phase velocity, 14, 20 
Physical quantities, defined by operators, 
242-245 

objective existence of, 243, 244 
Piece-by-piece continuous, 113n, 138 
Piece-by-piece smooth, 113 
Plancherel’s theorem, 30, 63, 170 
Planck ideal linear oscillator, 87-90, 
370, 575 

Planck radiation formula, 1, 448-450 
Poisson bracket symbol, 282 
Population of pure state, 439 
Predissociation of molecules, 109, 195 
Principal-axis transformation, 362, 372, 
373 

Probability, of configuration, 30, 52, 53 
of energy values, 235 
general calculation of, 256-259, 273, 
274, 323 

of momentum range, 62, 63, 67, 69 
Probability amplitude, 227, 236, 245, 
257, 273 

Dirac notation for, 268 
transformation of, 245-248, 270 
Probability current, 189 
Probability density, 220 
Pure case assemblage, or pure case, 55n, 
71 

preparation of, 331 

Pure state (see Pure case assemblage) 


Q 

Quadratic integrability, 30, 79, 117, 153, 
154, 226 

Quadrupole radiation, electric, 465-469, 
547 

Quantum defect, 480 
Quantum jumps, 290, 376, 460, 542 
Quantum number, azimuthal, l, 151, 471 
inner, J, 494 
M, 234, 495, 498 
magnetic, m, 151, 470 
radial, 154 
total, n, 154, 481 
vibrational, v, 154 

Quantum numbers, angular-momentum, 
151, 234, 493 

of atomic electrons, meaning of, 484- 
491 

half-integral, 89, 95, 155, 157, 317, 
494, 495, 498, 510 
quarter-integral, 183 
spin, 494, 498, 500, 511, 527 
Quantum statistical mechanics, 55, 432- 
448 

Quasivariable, 297 

R 

Radiation, electric dipole, 450-452, 454- 
460, 469-473 

electric quadrupole, 465-469 
emission and absorption of, 448-469 
magnetic dipole, 462-465 
quantum theory of, 2-7, 450 
Radiation field, classical, 26-29, 452, 454 
Radioactive disintegration, 109, 179, 
187-192 

Reciprocal, of a matrix, 349 
of an operator, 247, 281 
Reduction of the wave packet, 75, 326- 
333 

Reflection operators, 304-306 
Relativistic theory of hydrogenic atoms, 
507-509 

Relativity principle, 19-21 
Resolution of unity, 261 
Resonance between approximate eigen- 
states, 553 

Resonant energy intervals, 181 
Ritz method, 410-415 
Ritz series formula, 475, 478-481 
Rotation operator, 306, 532 





Rotation-reflection group, 308, 529, 532 
Russell-Saunders coupling, 528 
Rydberg constant, 158 
Rydberg series formula, 475 

S 

S complex, 539 

Scalar product, constant in time, 206 
of functions, 114-116, 246 
of vectors, 115 

Schlapp's method, 407, 596, 597 
Schrödinger matrix, 367 
canonical, 369 

Schrödinger wave equation, with external 
electromagnetic field, 26-29 
first form for a single particle, 15 
group of the, 310 

second form for a single particle, 15-17 
for a system of many particles, 21, 196 
Schwarz’s inequality, 116 
Secular equation, 361, 402, 407 
higher roots of, 415 

Selection rule, for harmonic oscillator, 
469 

for J, 494, 543, 547 
for K, 540-541 

for l, 470, 471, 492, 493, 541-543 

for L, 493, 527, 543-547 

for m, 470, 471, 541-543 

for M, 543, 545 

for M_L, 543-545 

for S, 527, 543 

for simple Zeeman effect, 472-473 
Self-adjoint differential operator (first 
definition), 121, 130 
Self-adjoint matrix, 351 
Self-adjoint operator (second definition), 
203 

Semiconvergent series, 91 
Separation of variables, 65, 125, 146, 378, 
533 

Series spectra, 474ff. 

Simultaneous eigenfunctions, existence 
of system of, 284, 285 
Simultaneous measurements and com- 
mutation, 334 

Simultaneously measurable dynamical 
variables, 257 
Single-electron jumps, 542 
Singular domains of a differential equa- 
tion, 79, 202, 208-212 



Singular-point boundary conditions, 125- 
128, 130, 131, 153 

Singular points, of a differential equation, 
79, 124n, 125 
irregular, 142 
isolated, 140 
regular, 141 

solution of differential equation near, 
140-143 

of wave functions, 198n 
Slater wave functions, 535, 536, 555 
Sodium, optical energy levels of, 476 
Sommerfeld phase integral, 91, 95, 103- 
106, 158, 375 

Sommerfeld polynomial method, 87-89, 
158, 159 

Sommerfeld quantum condition, 95, 375 
accuracy of, 106 

Space factor of monochromatic wave 
function, 17 

Spectroscopic stability, 462 
Spectrum of eigenvalues, continuous, 85 
Spectrum lines, intensities of, 377 
Spherical coordinates in six dimensions, 
210, 239, 240 

Spherical harmonics, 145, 147, 592, 593 
Spin-orbit interaction energy, 501-503, 
525 

Spin-spin interaction energy, 530 
Spontaneous ionization, 181, 213 
Stark effect, linear, for hydrogenic atoms, 
403-408, 545 
quadratic and cubic, 408 
State (see Subjective state) 

Statistical equilibrium, 436, 446 
Statistical interpretation of wave theory, 
51-58 

Statistical matrix, 435 
Statistical measurement, 55, 318, 319 
Statistical mechanics, 55 
quantum, 432-448 
Step function, 255,’^26fl 
Step matrix, 363, 390 
Stern-Gerlach experiment, 229 
Stieltjes integral, 261 
Stokes phenomenon, 96n, 97, 98, 99 
Stokes region, 98, 576 
Sturm-Liouville equation, 124, 163 
Sturm-Liouville problem, 125, 128 
completeness of system of eigenfunctions 
of, 137-144 
existence of solution of, 128, 129 





Subjective nature of wave function, 327, 
328, 331 

Subjective state, 52, 54, 69 
Sum-integral operation S, 246 
Surface harmonics, 147 
Symmetric dynamical variables, 337 
Symmetric top, 233, 234 
Symmetry properties of wave equation, 
303-317 

Symmetry values, 304 

Systems of atomic energy levels, 493, 527 

T 

Term-by-term application of operator, 
175, 176, 274, 275, 592 

Terms, spectroscopic, 530 

originating in a given configuration, 
537-540 

Tesseral harmonics, 149 
Thermodynamic equilibrium, 433 
Thermostat assemblage, 433, 440 
Time-displacement operator, 201, 281 
Top, symmetric, 233, 234 
Transform, 24n 
Fourier, 36, 221 
Transformation, canonical, 247 
canonical matrix, 355, 357, 358 
contact, 26, 294 

of dynamical-variable operators, 245-248 

of Hamiltonian operator, 237-240 
principal-axis, 362, 372, 373 
of probability amplitudes, 245-248, 270 
of spin matrices, 517-519 
unitary, 247 
Van Vleck, 394 
of wave equation, 64 
Transformation operators, 245, 270, 271 
Transformation theory of Dirac and 
Jordan, 255, 271, 366 
Transition probability, 78 
Born, 431, 454-458 

Einstein, 431, 449, 451, 458-462, 465, 
469, 545 

Transmission of matter waves through a 
potential hill, 109-112, 180, 575-578 
Transmission coefficient, 111, 112, 188, 
191 

Triple-reflection operator, 305, 531, 
541 


“Tunnel effect,” 109, 180 
Two-particle problem, 146ff., 175 

U 

Undetermined multipliers of Lagrange, 
580 

Uniqueness, of Hermitian manifold, 263 
of solution of eigenvalue-eigenfunction 
problem, 263 
Unitary matrix, 351 

Unitary operator, 247, 275, 281, 512, 524 
V 

Van Vleck's method for second-order 
perturbations, 394 
Variation, of constants, 428 
first, 130, 558 
second, 133n 

Variational forms of eigenvalue-eigen- 
function problem, 130-132, 206, 207, 
408-419, 578-582 

Variational method, with non-orthogonal 
functions, 418 

with nonlinear parameters, 418 
Vector representation of a function, 118, 
119, 356 

Velocity operators, 232 
Virial theorem, 506n 
Volcano model for alpha-particle dis- 
integration, 180 

W 

W. K. B. method (see Brillouin-Wentzel-Kramers 
method) 

Wave equation, relativistic, 20, 21 
transformation of, 64 

(See also Schrödinger wave equation) 
Wave functions, analyticity of, 18, 198, 
200 

antisymmetric, 337, 533-536 
determination of, 70-72 
physical interpretation of, 29-33, 82 
physically admissible, 17, 79, 131, 
197-201 
with spin, 524 

as probability amplitudes, 63 
Slater, 535, 536, 555 
subjective nature of, 327, 328 





Wave length, de Broglie, 13, 20 
Compton, 220 

in inhomogeneous medium, 38 
Wave packets, 10-12, 16 
and classical orbits, 331-334 
length of, 40 

and Newton's second law, 49-51 
in one dimension, 37-41 
reduction of, 75, 326-333 
in three dimensions, 41, 42, 48 
Weak quantization, 178-195, 213, 214, 
509 

Weights, of pure states in mixture, 320 
statistical, of energy levels, 448, 531, 
533 



Wronskian determinant, 97, 101, 111, 112, 
590 

X 

X-ray energy levels, 195, 509 
Z 

Zeeman effect, complex, for alkalies, 522 
second-order, 402, 403 
selection rules and polarization for, 
472, 473, 545 
simple, 398-403 





